This repository was archived by the owner on Nov 22, 2024. It is now read-only.

Make it deal with new OPDS catalog #2

Open
kelson42 opened this issue Dec 25, 2018 · 10 comments

Comments

@kelson42
Contributor

Kiwix now publishes its catalog in OPDS format; see https://wiki.kiwix.org/wiki/OPDS

@orblivion
Owner

orblivion commented Dec 25, 2018 via email

@orblivion
Owner

At the very least, I can try to call this OPDS feed when I generate the package, so that the content downloading step of the onboarding process has more up-to-date content to choose from. This means the catalog would be hard-coded into the package, so I would have to release a new version to update the catalog, but it's at least an improvement over the current situation (where I sort of manually picked out stuff).

I'll look more closely at this when I get around to it, but if this has some sort of web page generating facility that makes it look like library.kiwix.org, that would be even better. Assuming I could put some chrome around it to continue the onboarding process, I'd probably want to put that in the package as well.

@kelson42
Contributor Author

We have started to push library.kiwix.org, so far mostly as a demonstration of the catalog, but we would like to improve the welcome page with better filtering/search capabilities, a download link, widgets, and so on: all kinds of things that would help you easily build what you are aiming for.

@kelson42
Contributor Author

kelson42 commented Jan 5, 2023

library.kiwix.org and its OPDS stream are both stable now.

@orblivion
Owner

orblivion commented Jan 5, 2023

Hi @kelson42, I was just thinking about this. I'd love to update Kiwix Serve, especially now that it's back in Debian and so way easier to package. And I'd love to have your library downloading interface and get rid of my verbose homegrown setup interface.

The problem that (I assume) will stop me in my tracks is that Sandstorm has security features that proxy all network connections. It proxies outgoing connections to stop malicious apps from "phoning home", but it gives the user a popup to explicitly allow it.

The proxy is rather conservative about what it allows through. This may be for extra security in some way, but I'm actually not sure. The specific problem I've run into is that it has a maximum request size and doesn't support range requests, which means it can't download large files. The proxy for incoming requests (from the browser) has the same restriction. Part of my homegrown setup interface for Kiwix on Sandstorm, where the user uploads zim files to their Kiwix grain, works around this by using POST variables instead of the usual header. This won't work for the Kiwix backend downloader (unless you want to hack library.kiwix.org just for this, in which case we could do it). A simpler method may be to split the download into small chunks and have the downloader assemble them; I did this with my WIP OpenStreetMap app.
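To make the chunking idea concrete, here is a minimal sketch, not the code from the OpenStreetMap app. It assumes the server can expose a file as numbered parts that each fit under the proxy's request size limit; `split_into_parts`, `assemble_parts`, and the `fetch_part` callable are hypothetical names, and the demo uses an in-memory stand-in instead of real HTTP requests.

```python
import io
from typing import BinaryIO, Callable, List


def split_into_parts(data: bytes, part_size: int) -> List[bytes]:
    """Server side (hypothetical): cut a payload into parts of at most part_size bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def assemble_parts(
    fetch_part: Callable[[int], bytes], part_count: int, out: BinaryIO
) -> int:
    """Client side (hypothetical): fetch parts 0..part_count-1 in order and append to out.

    fetch_part stands in for one small request through the proxy.
    Returns the total number of bytes written.
    """
    written = 0
    for index in range(part_count):
        chunk = fetch_part(index)
        out.write(chunk)
        written += len(chunk)
    return written


# Demo with an in-memory "server" instead of real network calls.
payload = bytes(range(256)) * 10  # 2560 bytes of sample data
parts = split_into_parts(payload, part_size=1000)
buffer = io.BytesIO()
total = assemble_parts(lambda i: parts[i], len(parts), buffer)
```

Each part is an ordinary small request, so nothing here depends on range request support in the proxy.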

But as I understand it, range requests are only an issue because they haven't gotten around to whitelisting them yet. At least that's what I remember about the incoming requests. @zenhack I don't suppose outgoing range requests are anywhere in the near priorities?

@orblivion
Owner

(you may want to pull up the GitHub page rather than just reading the email; I made a few clarifying edits)

@kelson42
Contributor Author

kelson42 commented Jan 5, 2023

@orblivion Thank you for the update. I see that this is mostly an issue on your side (with Sandstorm), for both the catalogue and the ZIM file download itself. For the catalogue part, you can now:

  • Query the OPDS API (and develop your own UI)
  • Use the library.kiwix.org UI directly (in an iframe, for example)
  • We also have a project for a widget

Let me know if something is unclear.
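As a rough illustration of the first option, the sketch below parses an OPDS (Atom) feed into a list of entries that a custom UI could render. The sample feed is made up for this example, and `parse_entries` is a hypothetical helper; the real feed URL and the exact fields Kiwix publishes are described on the wiki page linked above and are not reproduced here.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

ATOM_NS = "http://www.w3.org/2005/Atom"
# OPDS marks download links with a rel value starting with this prefix.
ACQUISITION_REL = "http://opds-spec.org/acquisition"

# Invented sample feed, for illustration only.
SAMPLE_FEED = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example catalog</title>
  <entry>
    <title>Example Wiki</title>
    <summary>An example entry</summary>
    <link rel="http://opds-spec.org/acquisition/open-access"
          type="application/x-zim"
          href="https://example.org/example_wiki.zim" />
  </entry>
  <entry>
    <title>No Download</title>
    <summary>Entry without an acquisition link</summary>
  </entry>
</feed>
"""


def parse_entries(feed_xml: str) -> List[Dict[str, str]]:
    """Return title, summary and first acquisition link for each Atom entry."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall(f"{{{ATOM_NS}}}entry"):
        download = ""
        for link in entry.findall(f"{{{ATOM_NS}}}link"):
            if link.get("rel", "").startswith(ACQUISITION_REL):
                download = link.get("href", "")
                break
        entries.append({
            "title": entry.findtext(f"{{{ATOM_NS}}}title", default=""),
            "summary": entry.findtext(f"{{{ATOM_NS}}}summary", default=""),
            "download": download,
        })
    return entries


entries = parse_entries(SAMPLE_FEED)
```

Since catalogue responses are small, a back end could fetch the feed and hand the parsed entries to its own pages.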

@orblivion
Owner

Sandstorm wouldn't let me embed an external page (again, security hardening). But I bet all of the API endpoints other than zim downloading would work just fine from the back end, since the responses would be small.

So, I could do a medium term fix. Right now my onboarding interface links to a few specific zim files as examples, and to the old catalog page for everything else. I could replace all of that with an interface that lets you search the live library, and read descriptions, and see thumbnails and all of that. But from that point it would have to work the same as now: the user would get presented with a download link they'd have to click, download the zim to their browser, and upload back to their Sandstorm grain.

Though, maybe I could let users download small zim files through the API, fwiw.
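One way to picture that split, as a sketch only: sort catalogue entries into those small enough to fetch from the back end and those that still need the manual download-and-upload path. The size limit below is a placeholder, since the actual proxy limit isn't stated in this thread, and `partition_by_size` and the entry fields are hypothetical.

```python
from typing import Dict, List, Tuple

# Placeholder value; the real proxy limit is not given in this thread.
ASSUMED_MAX_BYTES = 50 * 1024 * 1024

Entry = Dict[str, object]


def partition_by_size(
    entries: List[Entry], max_bytes: int = ASSUMED_MAX_BYTES
) -> Tuple[List[Entry], List[Entry]]:
    """Split entries into (direct, manual) based on their 'size' in bytes."""
    direct: List[Entry] = []
    manual: List[Entry] = []
    for entry in entries:
        if int(entry["size"]) <= max_bytes:
            direct.append(entry)   # small enough for a back-end download
        else:
            manual.append(entry)   # user downloads in browser, then uploads
    return direct, manual


# Invented sample data.
catalog = [
    {"title": "Small example", "size": 5 * 1024 * 1024},
    {"title": "Large example", "size": 80 * 1024 * 1024 * 1024},
]
direct, manual = partition_by_size(catalog)
```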

The long-term fix would be downloading via the API once the range request thing is figured out.

@zenhack

zenhack commented Jan 8, 2023

@orblivion, range requests aren't really on my list, no -- I'd be willing to advise if someone else wanted to add them. Step 1 would be to extend the schema in web-session.capnp with the necessary fields. The actual implementation for outgoing is the ExternalWebSession class in shell/imports/server/drivers/external-ui-view.js. We might also need to make changes to sandstorm-http-bridge to plumb the necessary fields through, unless you want to speak capnp directly.

Also a heads up if you haven't seen it re my own activity: https://zenhack.net/2023/01/06/introducing-tempest.html

@orblivion
Owner

orblivion commented Jan 8, 2023 via email
