Define a sliding inclusion zone to exclude matches "too far in time" #320
Replies: 2 comments 7 replies
-
@javierdvalle Welcome to the STUMPY community and thank you for your question. I can see how this feature could be very useful in certain use cases. Unfortunately, to be consistent with the originally published work, we have chosen a default exclusion zone and there is currently no way to do what you are asking. However, if you can provide a little more detail about your use case, the time series size/length, and the frequency of your data, then I may be able to offer an alternative solution.
-
@javierdvalle Thank you for the context.
Wonderful!
Technically, no. I think that what matrix profiles are really great at is when you don't know if there are any conserved behaviors in your data and you want to find the nearest neighbor for each subsequence in an unconstrained way. Thus, a matrix profile will compare Monday with Monday, Monday with Tuesday, Monday with Wednesday, and so on, and return the absolute best match. This is an exhaustive search. In your case, it sounds like you have an idea of where things may be conserved/repeating.
There are a few non-traditional ideas that come to mind (and please forgive me as they are not well thought out or are poorly formulated in my head):
Let's say you want to compare full weeks (7 days, starting on a Monday and ending on a Sunday) and align the days of the week. What you can do is insert
Let me know if that makes sense?
I think it's important to point out that matrix profiles are really good at identifying repeating shapes and not great at finding single-point peaks or anomalies (due to z-normalization of the subsequences). I find that a 7-day window is really pushing it and might even prefer a 14-day window, as 7 data points aren't much data to define a "shape". On the flip side, since you have daily data (which is "small"), you might just be able to run
Let's say you have 365 days' worth of data and a month is considered 30 days. If, as you stated in your original post, you want to exclude any matches beyond, say, two months (i.e., 60 days), then you can do something like:
This isn't super pretty (and it's admittedly hacky), but you are basically extracting one-week query chunks. I hope that gets you going in the right direction, but let me know if I can help clarify anything!
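The idea described above (take short query windows and only compare each one against data within two months of it) can be sketched in plain Python. This is a brute-force illustration under stated assumptions, not STUMPY's implementation and not necessarily the approach originally posted: the function names `znorm` and `local_nearest_neighbors`, the synthetic series, the use of every sliding 7-day window as a query, and the exclusion zone of `ceil(m / 4)` are all assumptions made for this example.

```python
import math
import random


def znorm(values):
    """Z-normalize a sequence; a constant sequence maps to all zeros."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0.0:
        return [0.0] * len(values)
    return [(v - mean) / std for v in values]


def local_nearest_neighbors(T, m, max_lag, excl):
    """Hypothetical helper: for each length-m subsequence of T, find its
    nearest neighbor among subsequences that start at most max_lag points
    away (the inclusion zone), skipping trivial matches within excl points."""
    n = len(T) - m + 1
    subs = [znorm(T[i:i + m]) for i in range(n)]  # normalize each window once
    profile = [math.inf] * n
    indices = [-1] * n
    for i in range(n):
        lo = max(0, i - max_lag)
        hi = min(n - 1, i + max_lag)
        for j in range(lo, hi + 1):
            if abs(i - j) <= excl:  # too close: trivial match
                continue
            d = math.dist(subs[i], subs[j])  # z-normalized Euclidean distance
            if d < profile[i]:
                profile[i], indices[i] = d, j
    return profile, indices


# Synthetic daily data (an assumption): one year with a noisy weekly cycle.
random.seed(0)
T = [math.sin(2 * math.pi * t / 7) + random.gauss(0, 0.1) for t in range(365)]

m = 7                    # one-week window
max_lag = 60             # inclusion zone: ignore matches beyond ~two months
excl = math.ceil(m / 4)  # exclusion zone size assumed for this sketch

profile, indices = local_nearest_neighbors(T, m, max_lag, excl)
```

Because the series is daily and short, the double loop is cheap here. With STUMPY installed, the inner loop could presumably be replaced by calling `stumpy.mass` on the query and the local slice of `T`, then masking the exclusion zone by hand.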
-
I am new to Stumpy. I have read in the tutorial about the exclusion zone relative to the diagonal to exclude trivial matches. I am not sure if this option exists because I've not seen it in the API docs.
But anyway, I would like to know if there exists the opposite option: defining a sliding inclusion zone to only include matches inside the inclusion zone, and exclude matches outside (too far in time).
For example, with a daily time series I would like to exclude matches beyond two months.