Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XInclude by xml:id and automatic <xi:fallback/> #187

Closed
wants to merge 6 commits into from

Conversation

alfsb
Copy link
Member

@alfsb alfsb commented Nov 23, 2024

This PR includes a fixup step on configure.php XInclude processing, to the effect to simulating XML Inclusions Version 1.1.

With this PR, an doc-en manual with this changes builds ok:

diff --git a/reference/filesystem/functions/fgetcsv.xml b/reference/filesystem/functions/fgetcsv.xml
index 279785ed00..66a2fda3c1 100644
--- a/reference/filesystem/functions/fgetcsv.xml
+++ b/reference/filesystem/functions/fgetcsv.xml
@@ -65 +65 @@
-    <varlistentry>
+    <varlistentry xml:id="xi..function.fgetcsv..parameters..separator">
@@ -73 +73 @@
-    <varlistentry>
+    <varlistentry xml:id="xi..function.fgetcsv..parameters..enclosure">
@@ -81 +81 @@
-    <varlistentry>
+    <varlistentry xml:id="xi..function.fgetcsv..parameters..escape">
diff --git a/reference/filesystem/functions/fputcsv.xml b/reference/filesystem/functions/fputcsv.xml
index f3a94fb1b1..edd1f42f8b 100644
--- a/reference/filesystem/functions/fputcsv.xml
+++ b/reference/filesystem/functions/fputcsv.xml
@@ -46,9 +46,9 @@
-    <xi:include xpointer="xmlns(db=http://docbook.org/ns/docbook) xpointer(id('function.fgetcsv')/db:refsect1[@role='parameters']//db:varlistentry[db:term[db:parameter[text()='separator']]]/.)">
-     <xi:fallback/>
-    </xi:include>
-    <xi:include xpointer="xmlns(db=http://docbook.org/ns/docbook) xpointer(id('function.fgetcsv')/db:refsect1[@role='parameters']//db:varlistentry[db:term[db:parameter[text()='enclosure']]]/.)">
-     <xi:fallback/>
-    </xi:include>
-    <xi:include xpointer="xmlns(db=http://docbook.org/ns/docbook) xpointer(id('function.fgetcsv')/db:refsect1[@role='parameters']//db:varlistentry[db:term[db:parameter[text()='escape']]]/.)">
-     <xi:fallback/>
-    </xi:include>
+   <xi:include xpointer="xi..function.fgetcsv..parameters..separator">
+    <xi:fallback/>
+   </xi:include>
+   <xi:include xpointer="xi..function.fgetcsv..parameters..enclosure">
+    <xi:fallback/>
+   </xi:include>
+   <xi:include xpointer="xi..function.fgetcsv..parameters..escape">
+    <xi:fallback/>
+   </xi:include>
@@ -73 +73 @@

The xi prefix or .. separators are not special. They are only a suggestion for a convention for these structurally incorrect xml:ids, but the code works perfectly ok with any xml:id.

This will also enable putting some of language-snippets.ent entities that are tightly related to an extension, and XIncluding these "hosted" entities directly from their extension related xml:id.

Fun fact: The PHP manual has ~25.000 IDs at XInclude processing step. Inspecting they two times in a row takes almost no time.

Copy link
Member

@Girgias Girgias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhhhh I quite like this!

@nielsdos could you review this as the DOM/XML expert :D (also pinging @haszi as we have found this very annoying in the past)

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to me.
Note that there's the occasional xinclide typo though

@alfsb
Copy link
Member Author

alfsb commented Nov 24, 2024

Typo fixed, also fixed an error condition. Two follow ups:

Suggested convention for XInclude xml:ids: {parent_id}..{local_xinclude_id}

I was experimenting with some reserved syntax for xml:id of includes, but as xml:id are NCName, their contents are severely restricted. I was worried about classes between structural ids, auto generated file ids and the new XInclude ids. But after a while, I got that any current id "reserves" a namespace of sorts, so using a convention like this will avoid collisions.

So in the examples above, the xml:ids created for XInclusion would be like function.fgetcsv..parameters..separator. This avoids collisions and also has a nice property of indicating where the referenced id is being defined, if some XIncludes fail (to help translators).

Suggested convention for XInclude xi:fallback: <xi:fallback><!--doc-en-fatal--></xi:fallback>

Speaking of failures, the other follow up is that <xi:fallback> presence effectively disables detecting XInclude failures. Tolerating XInclude failbacks is important for translations, but is a hard error on doc-en. So I propose another convention. On XInclude call sites:

<xi:include xpointer="function.fgetcsv..parameters..separator">
 <xi:fallback><!--doc-en-fatal--></xi:fallback>
</xi:include>

So that in the XInclude fixups, it would be possible to detect and report XInclude failures, and also make it fatal again on doc-en.

Instead of <!--doc-en-fatal-->, other suggestion is <!--xi:fallback-->, as no "xi:" is expect to survive $dom->xindluce().

@alfsb
Copy link
Member Author

alfsb commented Nov 24, 2024

Scratch the thing about doc-en-fatal. The last commit added automatic xi:fallback for all manual!

  • <xi:fallback/> can be removed completely
  • failures on doc-en are hard errors
  • failures on translations are soft errors
  • detailed reporting

For example, an XInclude like this:

<xi:include xpointer="nothing"/>

Is reported as:

Failed xi:include at:        /set(index)/book(appendices)/appendix(ini)/section(ini.list)/para/table/tgroup/tbody
Failed xi:include target is: nothing

Copy link
Contributor

@haszi haszi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good. This functionality is badly needed to avoid all that duplicated content we still have in doc-en.

@alfsb
Copy link
Member Author

alfsb commented Nov 25, 2024

xi:fallback/ can be removed completely

I tested this specially. Mass deleted <xi:fallback/> on doc-en, and built all languages. No new regressions (ro andit are effectively dead, es got a comeback).

I think this PR is ready to merge, but as I have (many) other PRs, I will keep merging at a slow pace, to avoid mixing the issues they address. Editors are welcome to merge this, to accelerate the new simpler XInclude by xml:id, automatic xi:fallback/ usage.

@alfsb alfsb changed the title XInclude fixup to enable using xml:id targets XInclude by xml:id and automatic <xi:fallback/> Nov 25, 2024
@alfsb
Copy link
Member Author

alfsb commented Dec 2, 2024

Delaying this, waiting for #194 . Also, let me use this to write about what I tried so far:

  • To take a list of DOMElements before xinclude(). This does not work, as "unchanged" DOMElement does not compare equal before and after xinclude().
  • Making a list of xml:id and the xpaths inside the document. This also does not work, as all xpaths can change because of one xinclude().
  • Placing annotations at target's parents, so it would be possible to identify newer xml:ids without marked parentes. This will fail with recursive XInclude (PR 194).

There is now so much code that was written around libxml missing XInclude 1.1, that now may be the time to write code about XInclude 1.1 itself. That is, to implement the entire XInclude in pure PHP. This may sound ludicrous (as it is), and yet may be the only viable and stable solution, instead waiting for libxml Xinclude 1.1 support that is not even planned.

The next push will change implementation to the third item above (use annotations at target's parent). It also adds code removing duplicated IDs after XInclude, that may be the result of mixed XPath/xml:id xpointer usage, and now, recursive XInclude usage. it still remains an imperfect solution for the problem, but will enable start using xml;Id pointers, instead of xpaths.

This all makes me concludes that ID schema for XInclude (that is, structural_id .. local.path) has a new merit: of making sure (by convention) that no structural xml:id is used in XIncludes, and any "random" deleted structural xml:id is a bug, in the absence of a perfect solution.

@alfsb
Copy link
Member Author

alfsb commented Dec 3, 2024

This got conflicted after PR 194, as expected. PR 194 was a quick fix, PR 196 completes it, and further conflicts here. Instead of cherry picking this (now) big change set, I will do a start over after PR 196.

A prototype for XInclude 1.1 by xml:id in pure PHP code was successful and way simpler/smaller than this PR. I will try this alternative in the following week, after merging 196. Thanks so much for your reviews, sorry for the noise. Backing up to sequential, semanal changes.

@alfsb alfsb closed this Dec 3, 2024
@alfsb alfsb deleted the xinclude branch December 3, 2024 14:03
@alfsb alfsb mentioned this pull request Dec 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants