-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
contains operator #99
Conversation
…ytes to compare with bytes_lit
filtergen/src/lib.rs
Outdated
//! | `<` | `lt` | Less than | `ipv6.payload_length < 500` | | ||
//! | `in` | | In a range, or in a subnet | `ipv4.src_addr in 1.2.3.4/16` | | ||
//! | `~` | `matches` | Regular expression match | `tls.sni ~ 'netflix\\.com$'` | | ||
//! | Operator | Alias | Description | Example | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a huge deal, but do you know what changed here to cause this diff? Seems like there shouldn't actually be a change here.
filtergen/src/lib.rs
Outdated
//! | `<` | `lt` | Less than | `ipv6.payload_length < 500` | | ||
//! | `in` | | In a range, or in a subnet | `ipv4.src_addr in 1.2.3.4/16` | | ||
//! | `~` | `matches` | Regular expression match | `tls.sni ~ 'netflix\\.com$'` | | ||
//! | `contains` | | Check if right appears in left | `tls.sni contains \|2E 63 6F 6D\|` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the \
required to indicate bytes?
filtergen/src/utils.rs
Outdated
let bytes_lit = syn::LitByteStr::new(b, Span::call_site()); | ||
let num_bytes = bytes_lit.value().len(); | ||
quote! { | ||
#proto.#field().as_bytes().windows(#num_bytes).any(|w| w == #bytes_lit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that we talked about this and I said windows
seemed like a good idea, but I just found the following crates that look like they would be more efficient. I think these would work for both strings and bytes.
- https://docs.rs/twoway/latest/twoway/ -- looks like two-way string matching will be a lot faster for longer strings in particular, though I'm not sure what "longer" means.
- https://docs.rs/memchr/latest/memchr/memmem/fn.find.html -- I'm not sure if this requires specialized HW support to actually be faster though? Based on the documentation, it looks like initializing one
Finder
and adding it to statics could also be faster.
Could you do a bit of research into these and figure out which one makes the most sense? I don't know that we have to go into depth profiling them, but worth taking a look.
This PR implements support for a
contains
operator that can be used with strings or bytes. When used with bytes, the protocol field of the filter can be a string or a byte (if it's a string, it gets converted to a byte before comparison is done).