Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data looping in classification dict generation functions #110

Merged
merged 6 commits into from
Jan 25, 2025

Conversation

TomHall2020
Copy link
Contributor

As discussed in #103, the classification data generation functions can be unindented quite nicely by replacing direct iteration over bowstyle/gender/age divisions with the itertools product of those.

Implementing that was very straightforward, but while I was in there I did smell an opportunity to excise another bit of looping, where the class handicap thresholds and minimum distances themselves are built up by manual iteration over the indecies of the classification labels. That took a bit more effort, but I think is much more explicit this way, and the result is that the loop building up the classification data is now pretty much solely packaging things nicely, and has a minimum of inline logic.

Inside the agb_outdoor _assign_min_dist function I could come up with various ways to build up the required indecies based on the number of mb_categories and adjustments etc, but in the end I felt that just declaring it completely explicitly was the most readable and clear way to do it. The use of np.take with mode="clip" works really nicely to get rid of the try/except index error handling.

There are still some magic numbers (1 and 2) for offsetting the indecies when using the age/gender steps but since I have absolutely no idea why they were there before I can't do much do make them more explicit here.

No tests broken, so just up to you @jatkinson1000 if you're happy with the new division of logic?

Copy link
Owner

@jatkinson1000 jatkinson1000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @TomHall2020
Itertools definitely makes it cleaner.

I added a couple of comment suggestions to make it clearer to myself (np.take() is new to me) and I think one of the docstrings needs updating in the field routines.
Then should be good to go.

More generally, where min_dist functions returns ints, I wonder if these should be floats.
They are internal functions and in reality it makes not difference to the functioning of the code, but philosophically if we are comparing them to yard values in some places they should perhaps be floats.
Any thoughts?

@jatkinson1000
Copy link
Owner

And to answer your question about the magic numbers @TomHall2020, when developing the outdoor classifications we decided to pin the MB handicap for each bowstyle and then set things relative to that. Hence the -2 outdoors and -1 indoors.

This actually turned out to be a real pain in the ass when it came to developing the field classifications, and I wish we had fixed the EMB point, but it seemed like a good idea at the time. IIRC the suggestion was that MB and below was likely to remain fixed, given the desire for a clear "system of progression" at the lower levels and more substantial data, whereas GMB and EMB may need to be tuned over time. Whether this ever happens we'll see...

Would happily discuss over a coffee/beer/etc.

TomHall2020 and others added 5 commits January 25, 2025 13:53
… operations

draft 1 replace inner loop on agb_outdoor classification data generation

draft 2 replace internal loops on agb 2023 outdoors

incomplete, old implementation still present

draft 3 complete replacement of inner loops

draft 4 vectorise handicap caluclation for field and indoor agb2023 classifications

draft 5 refactor distance calculations on agb_field to inside dedicate distance function
Adds additional clarification to more obscure functions/inputs.
Copy link
Owner

@jatkinson1000 jatkinson1000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a useful addition so made the changes and rebased on main to merge.

@jatkinson1000 jatkinson1000 merged commit f357c61 into jatkinson1000:main Jan 25, 2025
15 checks passed
Copy link

codecov bot commented Jan 25, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.80%. Comparing base (4aab80f) to head (e65e973).
Report is 1 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #110      +/-   ##
==========================================
- Coverage   97.82%   97.80%   -0.02%     
==========================================
  Files          30       30              
  Lines        1748     1733      -15     
==========================================
- Hits         1710     1695      -15     
  Misses         38       38              
Files with missing lines Coverage Δ
...utils/classifications/agb_field_classifications.py 100.00% <100.00%> (ø)
...tils/classifications/agb_indoor_classifications.py 100.00% <100.00%> (ø)
...ils/classifications/agb_outdoor_classifications.py 100.00% <100.00%> (ø)
...cheryutils/classifications/classification_utils.py 88.63% <100.00%> (ø)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants