Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Database with all perfect matches fails due to missing is_extended keys #26

Open
jrober84 opened this issue Jun 6, 2024 · 1 comment

Comments

@jrober84
Copy link
Collaborator

jrober84 commented Jun 6, 2024

Traceback (most recent call last):
File ".conda/envs/profile_dists/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3652, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 147, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 176, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 7080, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 7088, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'is_extended'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File ".conda/envs/profile_dists/bin/locidex", line 8, in
sys.exit(main())
File ".conda/envs/profile_dists/lib/python3.9/site-packages/locidex/main.py", line 51, in main
exec('locidex.' + task + '.run()')
File "", line 1, in
File ".conda/envs/profile_dists/lib/python3.9/site-packages/locidex/extract.py", line 272, in run
run_extract(config)
File ".conda/envs/profile_dists/lib/python3.9/site-packages/locidex/extract.py", line 202, in run_extract
exobj = extractor(hit_df,seq_data,sseqid_col='sseqid',queryid_col='qseqid',qstart_col='qstart',qend_col='qend',
File ".conda/envs/profile_dists/lib/python3.9/site-packages/locidex/classes/extractor.py", line 31, in init
loci_ranges = self.group_by_locus(self.df,sseqid_col, queryid_col,qlen_col,extend_threshold_ratio)
File ".conda/envs/profile_dists/lib/python3.9/site-packages/locidex/classes/extractor.py", line 543, in group_by_locus
is_extended = row['is_extended']
File ".conda/envs/profile_dists/lib/python3.9/site-packages/pandas/core/series.py", line 1007, in getitem
return self._get_value(key)
File ".conda/envs/profile_dists/lib/python3.9/site-packages/pandas/core/series.py", line 1116, in _get_value
loc = self.index.get_loc(label)
File ".conda/envs/profile_dists/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3654, in get_loc
raise KeyError(key) from err
KeyError: 'is_extended'

This can be fixed by adding a check for the keys and setting false if not found. Or ensuring that the keys are inserted into the DF earlier

@jrober84
Copy link
Collaborator Author

jrober84 commented Jun 6, 2024

In extractor update the return condition if no truncated records to set default values of False for the columns used later

    if len(trunc_records) == 0:
        df = df.assign(is_extended=False)
        df = df.assign(is_5p_extended=False)
        df = df.assign(is_3p_extended=False)
        return df

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant