Skip to content

Filters based on Columns doesn't work if original column names are in upper case #1254

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
kmehomiyski opened this issue May 20, 2019 · 0 comments

Comments

@kmehomiyski
Copy link

kmehomiyski commented May 20, 2019

When using column name filter such as include <database>.<table>.<column_name> = <value>, the filters fails. The reason is the following:

In com.zendesk.maxwell.filtering.Filter#couldIncludeFromColumnFilters(String database, String table, Set<String> columns) method, the columns argument have all database column names in lower case (they are always in lowercase, despite the actual column names are in upper case). Argument values are passed to this method from com.zendesk.maxwell.replication.BinlogConnectorReplicator#shouldOutputEvent(String database, String table, Filter filter, Set<String> columnNames) method, which in turn gets table column names from com.zendesk.maxwell.schema.TableColumnList#columnNames() which build column names with the code:

columnNames.add(cf.getName().toLowerCase().intern());

This means that if we want the raw with a column with a given value to be included, the name of the column has to be specified in lower case (despite the actual name of the column).

On the other hand, the final decision to include or exclude the row is made in com.zendesk.maxwell.filtering.Filter#includes(java.lang.String, java.lang.String, java.util.Map<java.lang.String,java.lang.Object>) method. This method is called from com.zendesk.maxwell.replication.BinlogConnectorReplicator#shouldOutputRowMap(String database, String table, RowMap rowMap, Filter filter) method, which in turns gets table column names from com.zendesk.maxwell.row.RowMap#getData(). In this method, original column names are used. This means that we should use original column names in the filter specification, if we want them to pass this check. The two methods (com.zendesk.maxwell.filtering.Filter#couldIncludeFromColumnFilters(String database, String table, Set<String> columns) and com.zendesk.maxwell.filtering.Filter#includes(java.lang.String, java.lang.String, java.util.Map<java.lang.String,java.lang.Object>)) use different syntaxis for filter specification, which is mutually exclusive, however the two methods need to work in unison.

I'm proposing a fix to this problem: do not use .toLowerCase() in com.zendesk.maxwell.schema.TableColumnList#columnNames() method. An example of the fix can be found here

A pull request was also created.

There are no additional test cases added, that can reproduce the problem or anything else that makes it easy to confirm the fix, but it should be fairly simple to test it with any local database (that's what I did). I'm willing to work on this pull request, as this feature is very important for our team (I guess for others too), and we would like to avoid maintaining a fork. If there is anything else I need to do please let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant