Data looping in classification dict generation functions #110

TomHall2020 · 2024-12-17T23:39:54Z

As discussed in #103, the classification data generation functions can be unindented quite nicely by replacing direct iteration over bowstyle/gender/age divisions with the itertools product of those.
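For illustration only, a minimal sketch of that flattening. The division lists below are hypothetical placeholders, not the values archeryutils reads from its data files:

```python
from itertools import product

# Hypothetical division labels, for illustration only.
bowstyles = ["Compound", "Recurve"]
genders = ["Male", "Female"]
ages = ["Adult", "Under 18"]

# Before: three nested loops, so the body sits three indents deep.
nested = []
for bowstyle in bowstyles:
    for gender in genders:
        for age in ages:
            nested.append((bowstyle, gender, age))

# After: a single loop over the Cartesian product, in the same order.
flat = []
for bowstyle, gender, age in product(bowstyles, genders, ages):
    flat.append((bowstyle, gender, age))
```

Both versions visit the same combinations in the same order, because `product` varies its last argument fastest, just as the innermost loop does.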

Implementing that was very straightforward, but while I was in there I smelled an opportunity to excise another bit of looping, where the class handicap thresholds and minimum distances themselves are built up by manual iteration over the indices of the classification labels. That took a bit more effort, but I think it is much more explicit this way, and the result is that the loop building up the classification data is now pretty much solely packaging things nicely, with a minimum of inline logic.

Inside the agb_outdoor _assign_min_dist function I could come up with various ways to build up the required indices based on the number of mb_categories and adjustments etc., but in the end I felt that just declaring them completely explicitly was the most readable and clear way to do it. The use of np.take with mode="clip" works really nicely to get rid of the try/except index error handling.
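As a rough sketch of the np.take idea (the distances and indices here are made up, and this is not the actual _assign_min_dist code), clipping out-of-range indices to the last element gives the same result as catching IndexError by hand:

```python
import numpy as np

# Hypothetical distance ladder, longest first; not the real data.
distances = np.array([90.0, 70.0, 60.0, 50.0, 40.0, 30.0])

# Explicitly declared indices, some of which run past the end of the array.
indices = [0, 1, 2, 4, 6, 8]

# mode="clip" replaces any too-large index with the last valid index.
clipped = np.take(distances, indices, mode="clip")

# Equivalent loop with manual error handling, for comparison.
looped = []
for i in indices:
    try:
        looped.append(float(distances[i]))
    except IndexError:
        looped.append(float(distances[-1]))
```

Note that mode="clip" also clips negative indices to 0 rather than counting from the end, so this only matches the try/except version when the indices are non-negative.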

There are still some magic numbers (1 and 2) for offsetting the indices when using the age/gender steps, but since I have absolutely no idea why they were there before I can't do much to make them more explicit here.

No tests broken, so just up to you @jatkinson1000 if you're happy with the new division of logic?

jatkinson1000

Thanks @TomHall2020
Itertools definitely makes it cleaner.

I added a couple of comment suggestions to make it clearer to myself (np.take() is new to me) and I think one of the docstrings needs updating in the field routines.
Then should be good to go.

More generally, where the min_dist functions return ints, I wonder if these should be floats.
They are internal functions and in reality it makes no difference to the functioning of the code, but philosophically, if we are comparing them to yard values in some places, they should perhaps be floats.
Any thoughts?

archeryutils/classifications/agb_field_classifications.py

archeryutils/classifications/agb_outdoor_classifications.py

jatkinson1000 · 2024-12-18T16:56:49Z

And to answer your question about the magic numbers @TomHall2020, when developing the outdoor classifications we decided to pin the MB handicap for each bowstyle and then set things relative to that. Hence the -2 outdoors and -1 indoors.

This actually turned out to be a real pain in the ass when it came to developing the field classifications, and I wish we had fixed the EMB point, but it seemed like a good idea at the time. IIRC the suggestion was that MB and below was likely to remain fixed, given the desire for a clear "system of progression" at the lower levels and more substantial data, whereas GMB and EMB may need to be tuned over time. Whether this ever happens we'll see...
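A hedged sketch of what pinning MB could look like, assuming the threshold for each class is the MB handicap plus a fixed step per class away from MB. The handicap value, step size, and class labels below MB are invented for illustration and are not taken from the archeryutils data files:

```python
import numpy as np

# Illustrative class ordering, best first; MB sits at index 2 outdoors.
classes = ["EMB", "GMB", "MB", "B1", "B2"]
mb_index = 2  # the "-2" offset; an indoor list without EMB would use 1

# Hypothetical numbers, for illustration only.
mb_handicap = 30.0  # pinned handicap threshold for MB
class_step = 7.0    # handicap difference between adjacent classes

# Offsets relative to MB: negative above MB, zero at MB, positive below.
offsets = np.arange(len(classes)) - mb_index
thresholds = mb_handicap + offsets * class_step
```

With this layout the MB threshold is always exactly the pinned value, and retuning GMB or EMB means changing how the classes above MB are offset rather than moving the reference point.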

Would happily discuss over a coffee/beer/etc.

… operations
draft 1 replace inner loop on agb_outdoor classification data generation
draft 2 replace internal loops on agb 2023 outdoors incomplete, old implementation still present
draft 3 complete replacement of inner loops
draft 4 vectorise handicap calculation for field and indoor agb2023 classifications
draft 5 refactor distance calculations on agb_field to inside dedicated distance function

Adds additional clarification to more obscure functions/inputs.

jatkinson1000

This is a useful addition, so I made the changes and rebased on main to merge.

codecov · 2025-01-25T14:01:48Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.80%. Comparing base (4aab80f) to head (e65e973).
Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #110      +/-   ##
==========================================
- Coverage   97.82%   97.80%   -0.02%     
==========================================
  Files          30       30              
  Lines        1748     1733      -15     
==========================================
- Hits         1710     1695      -15     
  Misses         38       38

Files with missing lines	Coverage Δ
...utils/classifications/agb_field_classifications.py	`100.00% <100.00%> (ø)`
...tils/classifications/agb_indoor_classifications.py	`100.00% <100.00%> (ø)`
...ils/classifications/agb_outdoor_classifications.py	`100.00% <100.00%> (ø)`
...cheryutils/classifications/classification_utils.py	`88.63% <100.00%> (ø)`

TomHall2020 force-pushed the data-looping branch from aabc56b to b195f5e Compare December 17, 2024 23:47

jatkinson1000 requested changes Dec 18, 2024

archeryutils/classifications/agb_field_classifications.py

archeryutils/classifications/agb_outdoor_classifications.py

jatkinson1000 mentioned this pull request Dec 18, 2024

Outdoor legacy classifications and generic classification API #103

Open

11 tasks

TomHall2020 and others added 5 commits January 25, 2025 13:53

Replace triple nested for loops with itertools.product

7cb842b

typing fixes

a2b020a

Update docs in _assign_dists of field classifications.

2c07d40

Update distances from data files to be floats rather than integers.

dd9c0e8

jatkinson1000 force-pushed the data-looping branch from e152652 to eeb030a Compare January 25, 2025 13:54

Apply commenting from code review

e65e973

Adds additional clarification to more obscure functions/inputs.

jatkinson1000 force-pushed the data-looping branch from eeb030a to e65e973 Compare January 25, 2025 13:56

jatkinson1000 approved these changes Jan 25, 2025

jatkinson1000 merged commit f357c61 into jatkinson1000:main Jan 25, 2025
15 checks passed
