Skip to content

Add support for loading intersphinx inventory from file #882

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
May 26, 2025

Conversation

idclifford
Copy link
Contributor

@idclifford idclifford commented Mar 20, 2025

This is a relatively simple update to support file:// urls in intersphinx. If the python library requests-file is available, file support will be added to the requests session.

Fixes #741
Fixes #881

Copy link
Contributor

@tristanlatr tristanlatr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @idclifford, many thanks for the PR!

I don’t quite understand how this will work since the file path does not carry the information regarding relative to what URL the links should be generated for a given inventory file. Some documentation regarding that new feature would be good, this can be added to docs/source/sphinx-integration.rst.

Also, it would be good to have some test cases that checks:

  • the inventory file loading from command line options
  • the actual links generated when we refer to an object that is defined in a inventory file

Somewhat a generic question, but, is it really worth it to introduce a new dependency just to load a file from disk? Are you using this to load files from nfs shares?

Many Thanks

@tristanlatr
Copy link
Contributor

Hello @idclifford, do you wish to continue working on this feature?

If so, I'de recommend the following approach:

  • Introduce a new repeatable option --intersphinx-file that loads a local inventory file, having the following format: PATH:BASE_URL.
    Where BASE_URL is the base for the generated links, it is mandatory if loading the inventory from a file.

Tell me what you think,

Thanks,

@idclifford
Copy link
Contributor Author

Thanks for at least considering the new feature. I had not planned on more extensive developments, but I understand you're not happy with the very basic addition.
I cannot promise to deliver anything soon, but, sure, I can give your suggestion a go.

@tristanlatr
Copy link
Contributor

It's not that I'm not happy, I'm really happy when new contributors emerges. It's just that I need to see this working, at least have a couple of unit tests. I cannot merge your contribution without tests.

Thanks for your understanding,

@idclifford
Copy link
Contributor Author

I have updated based on your suggestion.
The unit test is in progress.
Could you perhaps suggest a test that I could copy and modify?

@idclifford
Copy link
Contributor Author

I have added some unit tests as requested.

@tristanlatr
Copy link
Contributor

Hello @idclifford, many thanks for the adjustments! This is going in the right direction :)

Here is an example test case that would check the generated links when we load an inventory from a file,

The tests can be added to test_sphinx.py file

def test_generate_then_load_file(tmp_path: Path, 
                                 inv_writer_nolog: sphinx.SphinxInventoryWriter):
    
    from pydoctor.test.test_astbuilder import fromText
    from pydoctor.test.test_epydoc2stan import docstring2html

    # Create a testing inventory file under tmp_path/objects.inv
    src = '''
    class C:
        def __init__(self, a): 
            self.a = a
    '''
    system = fromText(src, modname='mylib').system
    tmp_path.mkdir()
    inv_writer_nolog.generate(system.rootobjects, tmp_path)

    # Then load this inventory file in a new system, include links to elements
    src = '''
    from mylib import C
    class Client:
        "L{C}"
        a: bytes 
        "L{C.a}"
    '''
    system2 = fromText(src, modname='myclient', 
                       systemcls=lambda: model.System(
                           model.Options.from_args(
                               [f'--intersphinx-file={tmp_path / "objects.inv"}']))).system
    
    Client_doc = docstring2html(system2.allobjects['myclient.Client'])
    Client_a_doc = docstring2html(system2.allobjects['myclient.Client.a'])

    assert Client_doc == '<p><a href="???"><p>'
    assert Client_a_doc == '<p><a href="???"><p>'

@idclifford
Copy link
Contributor Author

Documentation is still to come.

Copy link
Contributor

@tristanlatr tristanlatr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks for providing unit tests, I have a few questions regarding the usage of the feature and I'de like to read more about it . My principal concern is the fact that it does not seem to be able to generate http link

Comment on lines 792 to 793
assert (root_dir / 'module1.html').samefile(inv_reader_nolog.getLink('some.module1'))
assert (root_dir / 'module2.html').samefile(inv_reader_nolog.getLink('other.module2'))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting... so if I understand correctly, your generating links that do not include a http scheme but are paths to files directly. Can you tell me more about your usage of this feature ? are you serving all pydoctor generated pages for many projects under the same folder, so these links are all relative in practice ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just pushed the test you requested. It needed some adjustment to work, but I hope it's now okay. i have also updated sphinx-integration.rst as you requested.

I am using pydoctor for local projects that are not hosted online, mostly because the source code includes confidential data and we don't want to upload the source code to github, etc.
These updates to pydoctor were implemented and tested for one project so far. Here, the links are absolute and point to a location on a shared file system. I don't see any reason why they couldn't be relative, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I understand your usage.

Another use case for loading a inventory files from disk is when the API documentation are actually served somewhere (but behind a login page of some sort or in a inaccessible network segment), in that situation pydoctor won't be able to retrieve the inventory file itself from the URL. To accommodate this kind of usage, the --intersphinx-file option could have the following format PATH[:BASE_URL] where the base url is optional, but when used the links generated will be actually HTTP links.

Tell me what you think,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I fully understand how this would work in practice. Are you saying that the code documentation is accessible but the inventory file is not. The user would have to manually download the inventory file to the local file system and use this to build the links. The resulting http links would then point to the html pages that are not behind a security page.
If I understand correctly, this is doable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let’s say the documentation is behind a basic authenticated website. Pydoctor doesn’t support authentication to retrieve intersphinx inventory at the moment. So the solution would be to download objects.inv files withcurl -u user:password https://someprotectedsite/project1/api/objects.inv > project.inv before calling pydoctor with option —intersphinx-file=./project.inv:https://someprotectedsite/project1/api.

Tell me what you think

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @idclifford, are you interested in implementing this feature?

If you do not wish to invest more time that this, I understand.

Many thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It should be quite simple. I'll try to do it in the coming weeks. I'm a bit short on free time now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm looking into this now. The colon : potentially conflicts with the colon in the supplied URL (not necessarily critical) but would conflict with windows drive names. The alternative options would be to use double colon ::, semicolon ; or pipe character |.
Do you have a preference?

Copy link

codecov bot commented Apr 9, 2025

Codecov Report

Attention: Patch coverage is 93.93939% with 2 lines in your changes missing coverage. Please review.

Project coverage is 93.14%. Comparing base (1f146a9) to head (2d30a31).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
pydoctor/options.py 91.30% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master     #882   +/-   ##
=======================================
  Coverage   93.13%   93.14%           
=======================================
  Files          47       47           
  Lines        8702     8732   +30     
  Branches     1597     1600    +3     
=======================================
+ Hits         8105     8133   +28     
- Misses        336      338    +2     
  Partials      261      261           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@tristanlatr
Copy link
Contributor

Something is off with the readthedocs build :/

git checkout --force e8dee1baef35ea02e80e6af160a3f8928b15b629 0s Exit: 128
--
1 | fatal: reference is not a tree: e8dee1baef35ea02e80e6af160a3f8928b15b629

git checkout --force e8dee1baef35ea02e80e6af160a3f8928b15b629  
fatal: reference is not a tree: e8dee1baef35ea02e80e6af160a3f8928b15b629

@idclifford
Copy link
Contributor Author

I have no idea why the readthedocs fails.
Could it have something to do with tokens?

@tristanlatr tristanlatr changed the title Add support for file:// URLs for intersphinx Add support for loading intersphinx inventory from file Apr 11, 2025
@idclifford
Copy link
Contributor Author

Implementation, testing and documentation for loading intersphinx inventory from file while specifying a separate base URL is complete.
Currently assuming a double colon :: between the inventory file name and base URL.

@tristanlatr
Copy link
Contributor

Hello @idclifford and many thanks for the updated code. It is looking good. I have one or two nit picks (besides fixing the CI) that I will apply myself on your branch. Give me a couple of weeks and I’ll merge your contribution. Thanks

@idclifford
Copy link
Contributor Author

Thank you!

@tristanlatr tristanlatr linked an issue May 26, 2025 that may be closed by this pull request
@tristanlatr tristanlatr merged commit 6ad0318 into twisted:master May 26, 2025
40 of 41 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for file:// URLs for intersphinx Intersphinx from local folder
2 participants