Add per-volume dates to every volume of every edition of every reporter #19

mlissner opened this issue Mar 26, 2020 · 5 comments
@mlissner
Member

One example could be something like:

{
    "A.": [
        {
            "cite_type": "state_regional",
            "editions": {
                "A.": {
                    "end": "1938-12-31T00:00:00",
                    "start": "1885-01-01T00:00:00",
                    "volumes": {
                        "1": {"start": "1885-01-01T00:00:00", "end": "1885-06-01T00:00:00"},
                        "2": {"start": "1885-01-01T00:00:00", "end": "1885-06-01T00:00:00"}
                    }
                },
                "A.2d": {
                    "end": "2010-12-31T00:00:00",
                    "start": "1938-01-01T00:00:00"
                },
                "A.3d": {
                    "end": null,
                    "start": "2010-01-01T00:00:00"
                }
            }
        }
    ]
}

But that'd create a monster of a JSON file.

@mlissner
Member Author

I forgot to mention why these are useful. In https://github.com/freelawproject/courtlistener/issues/299, we've identified that we want to start finding citations that lack page numbers, such as 442 U.S. ___. For those, we can't look the case up by its full citation; all we'll have is the volume number, the reporter abbreviation, and, if we're lucky, some of the party info.

That means that the party info is the only unique thing we've got, so if we're going to use that, being able to refine by volume date would really help reduce false positives.
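As a rough illustration of that refinement, here is a minimal Python sketch. It assumes the "volumes" layout from the example above; the helper names (`volume_date_range`, `filter_by_volume_date`) and the `date_filed` field on candidate cases are hypothetical, not part of any existing API.

```python
from datetime import datetime

# Hypothetical data in the shape proposed earlier in this thread.
REPORTERS = {
    "A.": [
        {
            "cite_type": "state_regional",
            "editions": {
                "A.": {
                    "start": "1885-01-01T00:00:00",
                    "end": "1938-12-31T00:00:00",
                    "volumes": {
                        "1": {
                            "start": "1885-01-01T00:00:00",
                            "end": "1885-06-01T00:00:00",
                        },
                    },
                },
            },
        }
    ]
}


def volume_date_range(reporters, edition, volume):
    """Return (start, end) datetimes for a volume, or None if it is not listed."""
    for variants in reporters.values():
        for reporter in variants:
            info = reporter["editions"].get(edition)
            if info is None:
                continue
            dates = info.get("volumes", {}).get(str(volume))
            if dates is None:
                continue
            start = datetime.fromisoformat(dates["start"])
            # A null end date means the volume is still open.
            end = datetime.fromisoformat(dates["end"]) if dates["end"] else None
            return start, end
    return None


def filter_by_volume_date(candidates, date_range):
    """Keep candidates whose (assumed) date_filed falls inside the range."""
    start, end = date_range
    return [
        c
        for c in candidates
        if start <= c["date_filed"] and (end is None or c["date_filed"] <= end)
    ]
```

With this, two cases that share party names but were decided decades apart could be told apart by checking each filing date against the cited volume's range.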

@brianwc

brianwc commented Mar 27, 2020 via email

@mlissner
Member Author

That's a really good point, Brian. There's no point in doing what this issue proposes, at least not for the purpose we were contemplating. Thanks.

@mlissner
Member Author

mlissner commented Apr 6, 2020

Note that in #21, @jcushman points out that non-numeric volume numbers are a thing, so the above format would have some limitations.

@jcushman
Contributor

jcushman commented Apr 7, 2020

Couple more thoughts:

  • If this bug supersedes #21 ("Add valid volume number values?"), then the list of volumes should always be exhaustive; one can't add some volumes for an edition without adding all of them. To be precise, if an edition "end" date is provided, and a "volumes" dict is provided, then all valid volumes should be provided so users can assume that not-found volume keys are invalid citations.
  • Ordering: "volumes" should probably reflect the order of volumes, one way or another. Does it work for it to be an unordered dictionary instead of a list? I think maybe it does -- a user who wanted to see volumes in order could natsort the keys, which works on all the non-numeric volume numbers in CAP, or sort based on end date, which seems like it ought to work. If neither of those is satisfying, though, then maybe "volumes" should be a list.
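
Both points can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are hypothetical, `natural_key` is a small stdlib stand-in for what natsort does, and the "2a" volume label is made up to show a non-numeric key.

```python
import re


def natural_key(volume):
    """Split a volume label into text and integer chunks for natural ordering.

    re.split with a capturing group alternates text and digit runs, so odd
    positions are always digit runs and can safely be converted to int.
    """
    parts = re.split(r"(\d+)", volume)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def ordered_volumes(edition):
    """Return the edition's volume keys in natural order."""
    return sorted(edition.get("volumes", {}), key=natural_key)


def is_valid_volume(edition, volume):
    """Return True/False when the volume list is authoritative, else None.

    Assumption from the comment above: a missing key only means "invalid"
    when the edition has both an "end" date and a "volumes" dict.
    """
    volumes = edition.get("volumes")
    if volumes is None:
        return None  # no per-volume data, nothing can be concluded
    if str(volume) in volumes:
        return True
    if edition.get("end") is None:
        return None  # edition still open, list may be incomplete
    return False
```

For example, a closed edition with volume keys "10", "2", "1" and "2a" would be ordered as 1, 2, 2a, 10, and a lookup for volume 3 would come back as invalid. For an edition with no "end" date, a missing volume is reported as unknown rather than invalid.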
