Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Get Unique Values of Columns Method #38

Open
juaansb01 opened this issue Nov 6, 2024 · 0 comments
Open

Add Get Unique Values of Columns Method #38

juaansb01 opened this issue Nov 6, 2024 · 0 comments

Comments

@juaansb01
Copy link

🚀 Feature Request

Is your feature request related to a problem? Please describe.

  • In our team, the problem has arisen when we are asked to cross-reference the value received in the API with the existing values in the database. This can be done by going through all the rows, extracting the specific value of that column, and then using the resulting list of all those values, extracting the unique ones and checking it.

Describe the solution you'd like

  • In the way I have described I suppose it would work, but it seemed to me to be an expensive option. So I thought it would be nice to implement a method (in src/repository/crud/base.py) that would directly get the unique values of the table column we want.

Describe alternatives you've considered

  • I had also contemplated the possibility of allowing to exclude null values or empty strings, as this is something that in our case we have been asked not to contemplate.

Additional context

  • I leave here the code that we have written, and that in my case has worked. I'm open to any comments or improvements you see to make it as good as possible.
def get_unique_values(
        self: "CRUDBase[ModelType]", db: Session, column_name: str, include_nulls: bool = False
    ) -> list[Any]:
        """Get unique values of a specific column in the model's table, with an option to include
        or exclude NULL values.

        Args:
            db (Session): Database session.
            column_name (str): Name of the column to retrieve unique values from.
            include_nulls (bool): If False, excludes NULL or empty string values. Defaults to False.

        Returns:
            list[Any]: List of unique values of the specified column.

        Raises:
            AttributeError: If the specified column does not exist in the model.
            OperationalError: If an error occurs during the database operation.
        """
        logger.info("Entering...")
        logger.debug(
            "Retrieving unique values from column '%s' in %s", column_name, self.model.__name__
        )

        # Verify if the column exists in the model
        if not hasattr(self.model, column_name):
            error_message = f"Column '{column_name}' does not exist in {self.model.__name__}."
            logger.error(error_message)
            logger.info("Exiting...")
            raise AttributeError(error_message)

        # Retrieve unique values, with optional filtering of NULLs and empty strings
        try:
            column = getattr(self.model, column_name)
            query = select(column).distinct()

            # Exclude NULL or empty string values if include_nulls is False
            if not include_nulls:
                query = query.where(column.isnot(None)).where(column != "")

            unique_values = db.scalars(query).all()
            logger.debug("Unique values found: %s", unique_values)
            return unique_values
        except OperationalError:
            db.rollback()
            logger.exception(
                "Failed to retrieve unique values from column '%s' in %s",
                column_name,
                self.model.__name__,
            )
            raise
        finally:
            logger.info("Exiting...")
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant