When using a column-name filter such as `include <database>.<table>.<column_name> = <value>`, the filter fails. The reason is the following:
In the `com.zendesk.maxwell.filtering.Filter#couldIncludeFromColumnFilters(String database, String table, Set<String> columns)` method, the `columns` argument holds all of the table's column names in lower case (they are always lowercased, even when the actual column names are in upper case). The argument values are passed to this method from `com.zendesk.maxwell.replication.BinlogConnectorReplicator#shouldOutputEvent(String database, String table, Filter filter, Set<String> columnNames)`, which in turn gets the table column names from `com.zendesk.maxwell.schema.TableColumnList#columnNames()`, which calls `.toLowerCase()` on each name while building the set.
This means that for a row whose column has a given value to be included, the column name has to be specified in lower case in the filter, regardless of the column's actual name.
On the other hand, the final decision to include or exclude the row is made in the `com.zendesk.maxwell.filtering.Filter#includes(java.lang.String, java.lang.String, java.util.Map<java.lang.String,java.lang.Object>)` method. This method is called from `com.zendesk.maxwell.replication.BinlogConnectorReplicator#shouldOutputRowMap(String database, String table, RowMap rowMap, Filter filter)`, which in turn gets the column names from `com.zendesk.maxwell.row.RowMap#getData()`. There, the original column names are used, so the filter specification has to use the original column names to pass this check. The two methods (`Filter#couldIncludeFromColumnFilters` and `Filter#includes`) therefore expect different spellings of the same column name in the filter specification. These expectations are mutually exclusive whenever a column name contains upper-case characters, yet both checks must pass for a row to be output.
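To make the conflict concrete, here is a minimal, self-contained sketch of the two checks. It is not Maxwell's code: the class `ColumnFilterMismatch` and its helpers are hypothetical stand-ins, and it assumes both checks compare column names with exact, case-sensitive matching, as described above.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the two filter stages; not Maxwell's actual classes.
class ColumnFilterMismatch {
    // Mimics the behavior reported for TableColumnList#columnNames(): names are lowercased.
    static Set<String> lowercasedNames(List<String> actualColumns) {
        Set<String> names = new LinkedHashSet<>();
        for (String name : actualColumns) {
            names.add(name.toLowerCase());
        }
        return names;
    }

    // Stage 1 (event level): is the filter's column present in the table at all?
    static boolean couldInclude(String filterColumn, Set<String> columnNames) {
        return columnNames.contains(filterColumn);
    }

    // Stage 2 (row level): does the row hold the expected value under that column name?
    static boolean includes(String filterColumn, String expected, Map<String, Object> rowData) {
        Object value = rowData.get(filterColumn);
        return value != null && expected.equals(String.valueOf(value));
    }

    // A row is output only if both stages accept it.
    static boolean passesBoth(String filterColumn, String expected,
                              List<String> actualColumns, Map<String, Object> rowData) {
        return couldInclude(filterColumn, lowercasedNames(actualColumns))
                && includes(filterColumn, expected, rowData);
    }

    // Sample table with upper-case column names (illustrative data only).
    static List<String> sampleColumns() {
        return List.of("ID", "STATUS");
    }

    static Map<String, Object> sampleRow() {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ID", 1);
        row.put("STATUS", "active");
        return row;
    }

    public static void main(String[] args) {
        // Original-case name: rejected by stage 1, which only sees "id" and "status".
        System.out.println(passesBoth("STATUS", "active", sampleColumns(), sampleRow())); // false
        // Lower-case name: accepted by stage 1, rejected by stage 2 (the row key is "STATUS").
        System.out.println(passesBoth("status", "active", sampleColumns(), sampleRow())); // false
    }
}
```

Under these assumptions, no spelling of the column name gets a matching row through both stages.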
I'm proposing a fix for this problem: do not call `.toLowerCase()` in the `com.zendesk.maxwell.schema.TableColumnList#columnNames()` method. An example of the fix can be found here. A pull request was also created.
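The effect of the proposed change can be sketched as follows. Again, this is only an illustration under assumptions: `CasePreservingColumnNames` is a hypothetical class, not the actual patch, and both checks are assumed to use exact, case-sensitive name matching.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the proposed behavior; not the actual Maxwell patch.
class CasePreservingColumnNames {
    // Proposed behavior: return the column names exactly as declared, without lowercasing.
    static Set<String> columnNames(List<String> actualColumns) {
        return new LinkedHashSet<>(actualColumns);
    }

    // Both stages now see the same spelling of each column name.
    static boolean passesBoth(String filterColumn, String expected,
                              List<String> actualColumns, Map<String, Object> rowData) {
        boolean eventLevel = columnNames(actualColumns).contains(filterColumn);
        Object value = rowData.get(filterColumn);
        boolean rowLevel = value != null && expected.equals(String.valueOf(value));
        return eventLevel && rowLevel;
    }

    // Sample table with upper-case column names (illustrative data only).
    static List<String> sampleColumns() {
        return List.of("ID", "STATUS");
    }

    static Map<String, Object> sampleRow() {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("ID", 1);
        row.put("STATUS", "active");
        return row;
    }

    public static void main(String[] args) {
        // The original-case name now passes both stages.
        System.out.println(passesBoth("STATUS", "active", sampleColumns(), sampleRow())); // true
        // A value mismatch is still rejected at the row level.
        System.out.println(passesBoth("STATUS", "inactive", sampleColumns(), sampleRow())); // false
    }
}
```

One consequence worth noting: with case preserved, a filter written with a lower-case name for an upper-case column would stop passing the first check, so filter specifications would need to use the original column names throughout.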
No additional test cases have been added that reproduce the problem or otherwise make it easy to confirm the fix, but it should be fairly simple to test with any local database (that's what I did). I'm willing to work on this pull request, as this feature is very important for our team (and, I guess, for others too), and we would like to avoid maintaining a fork. If there is anything else I need to do, please let me know.