-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make stats and string matching case-insensitive #9
base: master
Are you sure you want to change the base?
Conversation
@@ -41,7 +41,7 @@ func (m LogMatcher) ExtractInfo(line string) LogMatch { | |||
if len(match) > 0 { | |||
result.Matched = true | |||
result.QueryClient = match[1] | |||
result.QueryName = match[2] | |||
result.QueryName = strings.ToLower(match[2]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! I think the idea is great, but I detect a possible bug here. Since BIND will log the query exactly as received, I think we want to fold the entire input down to lower case first or make the match case insensitive. If we don't, we'll have an issue under the following circumstances:
- The list of matches includes "FOOBAR.com" (but this is folded to "foobar.com") by the changes in names_collector.go
- The client queries for "FooBar.com" -> therefore, no match against "foobar"
Conversely, if we fold the whole line down to lower case (maybe on line 39 with line = strings.ToLower(line)
, then we'll have a successful match. Since DNS RFC says that DNS names are not case sensitive, a non-case match is indeed a match... I think this is fine.
To me, it feels like it's more obvious to fold to lower case than to check/enforce the regex to have the case-insensitive prepend flag. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uhm, I'm not sure what I missed? Wouldn't my PR make sure that 'foobar.com' is inserted in the array and by ToLower-ing the name when looking it up, isn't this going to cover both cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're close - the only issue is that the string coming into this function (in line
) is coming straight from the log file. I did a quick test and confirmed that BIND will log the value exactly as it is received. So, earlier on in the PR, you're correctly coaxing the search strings to lower case - but at this point, the lower case string is compared to a mixed case value in line
if the user queried for "FoObAr". So I'm thinking that if we convert line
to lower case around line 39, it'd be all set.
New: Strings read from the inc/excl files are rendered lowercase when adding to the internal lists. Names from log are lowercased at ingestion.
As it was, stats were produced for each variation of each name, in a case sensitive manner. This balloons the stats massively and makes it less intuitive to make per-name/domain statistics and monitoring. It is assumed that in the vast majority of cases, case is not interesting.