Sub-national regions should link to country #36

Open
lintool opened this issue Aug 22, 2020 · 1 comment

Comments

@lintool
Member

lintool commented Aug 22, 2020

When we recognize sub-national regions (e.g., states in the US or provinces in Canada), we should also provide a link to the entity ID of the country they belong to.

@Govind9
Collaborator

Govind9 commented Aug 27, 2020

We can add an extra attribute, parent, to province/state entities, alongside the existing wikilink attribute. Usage would look like this:

for ent in doc.ents:
    if ent.label_ in ['US_STATE', 'CANADIAN_PROVINCE']:
        print(f'{ent.text}, type: {ent.label_}, {ent._.wikilink}, parent: {ent._.parent} [{ent.start_char}, {ent.end_char}]')
    else:
        print(f'{ent.text}, type: {ent.label_}, {ent._.wikilink} [{ent.start_char}, {ent.end_char}]')
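
For reference, here is a minimal sketch of how a `parent` attribute could be registered, assuming the entities are spaCy spans (the `ent._.wikilink` syntax suggests spaCy custom extensions). The `COUNTRY_LINKS` table, the `get_parent` helper, and the link values are hypothetical placeholders, not the project's actual data; the entity is set by hand only to keep the example self-contained.

```python
import spacy
from spacy.tokens import Span

# Hypothetical label -> country link table; the real link format may differ.
COUNTRY_LINKS = {
    "US_STATE": "https://en.wikipedia.org/wiki/United_States",
    "CANADIAN_PROVINCE": "https://en.wikipedia.org/wiki/Canada",
}


def get_parent(span):
    """Return the country link for a sub-national entity, else None."""
    return COUNTRY_LINKS.get(span.label_)


# Register `parent` as a computed extension attribute on every Span.
# force=True only makes re-running this snippet safe.
Span.set_extension("parent", getter=get_parent, force=True)

# A blank pipeline needs no model download; label one entity manually.
nlp = spacy.blank("en")
doc = nlp("Ontario borders Quebec.")
doc.ents = [Span(doc, 0, 1, label="CANADIAN_PROVINCE")]

for ent in doc.ents:
    print(f"{ent.text}, type: {ent.label_}, parent: {ent._.parent}")
```

With a getter, entities of any other label simply report `None` for `parent`, so calling code does not need to branch on the label.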

Or we can just tack on the link after the matching process is done by checking the label type of the entities:

for ent in doc.ents:
    if ent.label_ == 'US_STATE':
        print(f'{ent.text}, type: {ent.label_}, {ent._.wikilink}, parent: {USA_WIKILINK} [{ent.start_char}, {ent.end_char}]')
    elif ent.label_ == 'CANADIAN_PROVINCE':
        print(f'{ent.text}, type: {ent.label_}, {ent._.wikilink}, parent: {CANADA_WIKILINK} [{ent.start_char}, {ent.end_char}]')
    else:
        print(f'{ent.text}, type: {ent.label_}, {ent._.wikilink} [{ent.start_char}, {ent.end_char}]')
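
If this second option is preferred, the if/elif chain could be collapsed into a lookup table so that adding another country's regions only means adding a dictionary entry. This is a rough sketch in plain Python: `PARENT_BY_LABEL`, `format_entity`, and the values assigned to `USA_WIKILINK` and `CANADA_WIKILINK` are assumptions for illustration, since the thread does not show how those constants are defined.

```python
from typing import Optional

# Placeholder values (assumption); the thread does not define these constants.
USA_WIKILINK = "https://en.wikipedia.org/wiki/United_States"
CANADA_WIKILINK = "https://en.wikipedia.org/wiki/Canada"

# Maps an entity label to the link of the country it belongs to.
PARENT_BY_LABEL = {
    "US_STATE": USA_WIKILINK,
    "CANADIAN_PROVINCE": CANADA_WIKILINK,
}


def format_entity(text: str, label: str, wikilink: str,
                  start_char: int, end_char: int) -> str:
    """Format one entity, adding a parent link only for known region labels."""
    parent: Optional[str] = PARENT_BY_LABEL.get(label)
    parent_part = f", parent: {parent}" if parent else ""
    return f"{text}, type: {label}, {wikilink}{parent_part} [{start_char}, {end_char}]"


print(format_entity("Ontario", "CANADIAN_PROVINCE",
                    "https://en.wikipedia.org/wiki/Ontario", 0, 7))
```

In the loop above this would be called as `format_entity(ent.text, ent.label_, ent._.wikilink, ent.start_char, ent.end_char)`.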

@lintool which option seems better? I think adding an extra attribute is more consistent.
