mzattera
diff --git a/README.md b/README.md
Lines changed: 5 additions & 0 deletions
diff --git a/docs/005/images/Gallows.PNG b/docs/005/images/Gallows.PNG
6.73 KB
diff --git a/docs/005/index.md b/docs/005/index.md
Lines changed: 4 additions & 3 deletions
diff --git a/docs/006/index.md b/docs/006/index.md
Lines changed: 25 additions & 17 deletions
diff --git a/docs/007/index.md b/docs/007/index.md
Lines changed: 19 additions & 8 deletions
diff --git a/eclipse/io.github.mzattera.v4j-apps/.classpath b/eclipse/io.github.mzattera.v4j-apps/.classpath
Lines changed: 1 addition & 1 deletion
diff --git a/eclipse/io.github.mzattera.v4j-apps/.settings/org.eclipse.jdt.core.prefs b/eclipse/io.github.mzattera.v4j-apps/.settings/org.eclipse.jdt.core.prefs
Lines changed: 3 additions & 3 deletions
diff --git a/eclipse/io.github.mzattera.v4j-apps/pom.xml b/eclipse/io.github.mzattera.v4j-apps/pom.xml
Lines changed: 2 additions & 2 deletions
diff --git a/eclipse/io.github.mzattera.v4j.cmc/.antlr-generator-3.2.0-patch.jar b/eclipse/io.github.mzattera.v4j.cmc/.antlr-generator-3.2.0-patch.jar
1.42 MB
diff --git a/eclipse/io.github.mzattera.v4j.cmc/.classpath b/eclipse/io.github.mzattera.v4j.cmc/.classpath
Lines changed: 9 additions & 0 deletions
diff --git a/eclipse/io.github.mzattera.v4j.cmc/.gitignore b/eclipse/io.github.mzattera.v4j.cmc/.gitignore
Lines changed: 1 addition & 0 deletions
diff --git a/eclipse/io.github.mzattera.v4j.cmc/.launch/Generate CmcCounter (grx) Language Infrastructure.launch b/eclipse/io.github.mzattera.v4j.cmc/.launch/Generate CmcCounter (grx) Language Infrastructure.launch
Lines changed: 22 additions & 0 deletions
diff --git a/eclipse/io.github.mzattera.v4j.cmc/.project b/eclipse/io.github.mzattera.v4j.cmc/.project
Lines changed: 34 additions & 0 deletions
diff --git a/eclipse/io.github.mzattera.v4j.cmc/.settings/org.eclipse.core.resources.prefs b/eclipse/io.github.mzattera.v4j.cmc/.settings/org.eclipse.core.resources.prefs
Lines changed: 2 additions & 0 deletions
diff --git a/eclipse/io.github.mzattera.v4j.cmc/.settings/org.eclipse.jdt.core.prefs b/eclipse/io.github.mzattera.v4j.cmc/.settings/org.eclipse.jdt.core.prefs
Lines changed: 10 additions & 0 deletions
@@ -233,6 +233,11 @@ This package contains some "utility" classes to deal with files, math, etc.
 Please take a look at what is in here before implementing anything from scratch.
 
 
+### `io.github.mzattera.v4j.cmc`
+
+This is an [Xtext](https://www.eclipse.org/Xtext/) project created for [Note 006](https://mzattera.github.io/v4j/006/); please refer to it for more details.
+
+
 ### Testing
 
 Project `io.github.mzattera.v4j-apps` contains JUnit tests for the v4j library and (some) of the "applications" in `v4j-apps`.
 
@@ -1,6 +1,6 @@
 # Note 005 - Slots and a New Alphabet
 
-_Last updated Jan. 9th, 2022._
+_Last updated Jan. 10th, 2022._
 
 _This note refers to [release v.6.0.0](https://github.com/mzattera/v4j/tree/v.6.0.0) of v4j;
 **links to classes and files refer to this release**; files might have been changed, deleted or moved in the current master branch.
@@ -124,6 +124,7 @@ appearing elsewhere in the text. The remaining cases (2 out of 100) are mostly w
 
 The below table shows occurrence of glyphs in slots for regular terms [{2}](#Note2).
 
+<a id="GliphCountImg" />
 ![Table with glyph count by slot.](images/Char Count by Slot.PNG)
 
 As expected, the distribution of glyphs in slots varies based on Currier language and illustration:
@@ -146,7 +147,7 @@ Below, I analyze more in detail some relationships between glyphs, as they appea
 
 #### Rare Characters
 
-Some EVA characters appears in the original interlinear transliteration very seldom[{3}](#Note3), end even less frequently in the concordance version used, 
+Some EVA characters seldom appear in the original interlinear transliteration[{3}](#Note3), and even less frequently in the concordance version used, 
 where they appear mostly as single characters, as shown in the table below (which also considers "unreadable" tokens).
 For this reason, I decided to ignore these characters and mark them as "unreadable character" for this analysis.
 
@@ -176,7 +177,7 @@ However:
 
 This leads me to think pedestalled gallows are Voynich characters in their own right, and not ligatures.
 
-In addition, the character 'c' appears outside of the pedestal or pedestalled gallows only 7 times ('c', 'oc','chcpar', 'ckshy', 'ocfshy', 'cs?t?eey', and 'o?cs'); similarly, the character 'h' appears outside of the pedestal, the pedestalled gallows or the "plumed" pedestal ('sh') only 4 times ('theody', 'docfhhy', 'cfhhy', adn 'd?ithy'). This seems a strong indication that EVA 'c' and 'h' do not correspond to Voynich characters[{4}](#Note4)[{5}](#Note5). 
+In addition, the character 'c' appears outside of the pedestal or pedestalled gallows only 7 times ('c', 'oc', 'chcpar', 'ckshy', 'ocfshy', 'cs?t?eey', and 'o?cs'); similarly, the character 'h' appears outside of the pedestal, the pedestalled gallows or the "plumed" pedestal only 4 times ('theody', 'docfhhy', 'cfhhy', and 'd?ithy'). This seems a strong indication that EVA 'c' and 'h' do not correspond to Voynich characters[{4}](#Note4)[{5}](#Note5). 
 
 
 #### 'e' and 'i'
 
@@ -1,8 +1,8 @@
 # Note 006 - Works on Word Structure
 
-_Last updated Jan. 9th, 2022._
+_Last updated Jan. 16th, 2022._
 
-_This note refers to [release v.6.0.0](https://github.com/mzattera/v4j/tree/v.6.0.0) of v4j;
+_This note refers to [release v.7.0.0](https://github.com/mzattera/v4j/tree/v.7.0.0) of v4j;
 **links to classes and files refer to this release**; files might have been changed, deleted or moved in the current master branch.
 In addition, some of this note content might have become obsolete in more recent versions of the library._
 
@@ -19,7 +19,6 @@ _Please refer to the [home page](..) for a set of definitions that might be rele
 In this page I will list, review, and comment works from different authors about the inner structure of Voynich words.
 When appropriate, I will compare their findings with my [slots concept](../005).
 
-I expect these notes to grow and refine over time (as for the others, to be honest).
 Numbers in square brackets indicate the date when corresponding works were published (as far as I can determine it).
 
 
@@ -88,25 +87,32 @@ Similarly, it can be seen that gallows in slots 3 and 7, which belong to the cor
 
 Stolfi notes: "_When designing the grammar, we tried to strike a useful balance between a simple and informative model and one that would cover as much of the corpus as possible. ... Conversely, the grammar is probably too permissive in many points, so that many words that it classifies as normal are in fact errors or non-word constructs_". It should be noted that the grammar is very good at parsing Voynichese
 (according to Stolfi it covers "_over 96.5% of all the tokens (word instances) in the text_") but,
-on the other side, it is also very bad in recognizing what is not Voynichese; the grammar accepts something in the order of 1.4e20 (100 billions of billions) different terms, only about 4'500 of which are terms in the manuscript ([concordance version](https://github.com/mzattera/v4j#ivtff)). Just for comparison, all the words that can be generated by the slot model amount at a total of 16'753'291 (13 order of magnitude less) of which around 2'800 are Voynich terms; the model covers slightly more than 88% of tokens (98% considering separable terms) but it is much easier to describe and understand.
+on the other hand, it is also very bad at recognizing what is not Voynichese; the grammar accepts something on the order of 1.4e20 (over 100 billion billion) different terms[{1}](#Note1), only about 4'500 of which are terms in the manuscript ([concordance version](https://github.com/mzattera/v4j#ivtff)). Just for comparison, all the words that can be generated by the slot model amount to a total of 16'753'291 (about 13 orders of magnitude fewer), of which around 2'800 are Voynich terms; the model covers slightly more than 88% of tokens (98% considering separable terms) but it is much easier to describe and understand.
 
-In summary, I do agree with Stolfi (and other authors) that the order in which characters appears in Voynich
-words is not arbitrary, but I think his model is misleading in suggesting a "layered" structure; for example,
-word prefixes and suffixes, which in Solfi's model both belong to the same layer (the crust), are indeed very different and assigning them to the same word structure looks completely arbitrary; ultimately, it seems
-the grammar suggests a "sequence" of possible characters, rather than a "onion-like" structure for words.
-If this is the case, it must be said the other, much simpler, models in this page show the same overall 
-structure of Voynich words even if in less details or with less coverage of Voynich terms.
-Regarding the fine details, these might not be as relevant as Stolfi admits that "_one should not give too much weight to the finer divisions and associations implied by our parse trees_". It should also be mentioned that the grammar
-looks unnecessary complex, mostly because of the way it handles "circles"; this makes very difficult to grasp the structure of Voynichese below the most superficial levels by looking at the grammar. This is further complicated by the fact that the huge majority of the words the grammar describes are clearly very different by those we found in the text.
+In summary:
+  
+  - I do agree with Stolfi (and other authors) that the order in which characters appear in Voynich
+words is not arbitrary.
+
+  - From a high-level structure point of view, the "layered" structure Stolfi proposes is debatable. For example,
+word prefixes and suffixes, which in Stolfi's model both belong to the same layer (the crust), are indeed very different and assigning them to the same word structure looks completely arbitrary. Similarly, divisions between mantle and core are unclear.
 
+  - From a fine-grained structure point of view, the grammar gets complex in an attempt to parse "circles" and 'e' sequences according to 
+  rules that, Stolfi agrees, are arbitrary; he also adds: "_one should not give too much weight to the finer divisions and associations 
+  implied by our parse trees_".
+  
+  - The grammar has very good coverage of the Voynich vocabulary, but it achieves this at the expense of selectivity. In other words, 
+the grammar recognizes a huge number of words that are not in the text and that clearly do not look like 
+Voynichese (e.g. words starting with 'yqaynoyx-').
 
 
-# Philip Neal [?]
+
+# Philip Neal 
 
 After writing [working note 005](../005), I realized Philip Neal published a [very similar concept](http://philipneal.net/voynichsources/transcription_neva_spaced/);
 his point was that this could be the result of using a grille to produce the text, something similar to the more complete approach described in [RUGG (2004)](../biblio.md).
 
-Neal confirmed [{1}](#Note1) that his grid scheme (which pre-dates my notes) is much the same as my slot concept and,
+Neal confirmed [{2}](#Note2) that his grid scheme (which pre-dates my notes) is much the same as my slot concept and,
 though he only put f103r on his website, he has analyzed every page of the manuscript along the same lines.
 
 He also mentions he knows of at least another researcher coming to similar conclusions independently.
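
As a rough check on the Note 006 comparison between Stolfi's grammar and the slot model, the figures quoted in the hunk above (about 1.4e20 accepted terms versus 16'753'291, and about 4'500 versus 2'800 matching Voynich terms) can be put side by side. The Java sketch below is illustrative only: the class and method names are hypothetical and not part of v4j, and the inputs are simply the rounded numbers quoted in the note.

```java
class GrammarSelectivity {

    /** Size gap between two term counts, expressed in orders of magnitude (base 10). */
    static double ordersOfMagnitude(double larger, double smaller) {
        return Math.log10(larger / smaller);
    }

    /** Fraction of the terms a model can generate that actually occur in the text. */
    static double selectivity(double termsInText, double termsGenerated) {
        return termsInText / termsGenerated;
    }

    public static void main(String[] args) {
        // Rounded figures as quoted in Note 006 (not recomputed here).
        double stolfiGenerated = 1.4e20;
        double stolfiInText = 4_500;
        double slotGenerated = 16_753_291;
        double slotInText = 2_800;

        System.out.printf("Gap between the two models: %.1f orders of magnitude%n",
                ordersOfMagnitude(stolfiGenerated, slotGenerated));
        System.out.printf("Selectivity of Stolfi's grammar: %.1e%n",
                selectivity(stolfiInText, stolfiGenerated));
        System.out.printf("Selectivity of the slot model:   %.1e%n",
                selectivity(slotInText, slotGenerated));
    }
}
```

Selectivity here is just one possible way to quantify "recognizing what is not Voynichese"; the note itself does not define a metric, so treat the ratio as an assumption made for this sketch.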
@@ -147,7 +153,7 @@ So NEVA and the Slot alphabet have different objectives, as my proposal aims at
 "fine distinctions" in Glen Claston's Voynich 101.
 
 
-# Sean B. Palmer [2004?]
+# Sean B. Palmer [2004]
 
 I found the below grammar attributed to Palmer by [Pelling](http://ciphermysteries.com/2010/11/22/sean-palmers-voynichese-word-generator) (see also below):
 
@@ -170,7 +176,7 @@ O = o
 According to Pelling, Palmer claims this grammar can generate 97% of Voynichese words, but this is clearly (as Pelling says) because it generates a lot of words (potentially infinitely many, looking strictly at the grammar).
 
 
-# Elmar Vogt [2009?]
+# Elmar Vogt [2009]
 
 Created a [grammar](https://voynichthoughts.wordpress.com/grammar/) for existing words in the Voynich Manuscript by analyzing the 
 stars section of the Voynich, which is written in Currier's B language.  
@@ -193,7 +199,9 @@ This pattern is fundamentally based on shapes of individual glyphs but also info
 
 **Notes**
 
-<a id="Note1">**{1}**</a> Personal communication, October 2021.
+<a id="Note1">**{1}**</a> Project [io.github.mzattera.v4j.cmc]() is an [Xtext](https://www.eclipse.org/Xtext/) project with a simple grammar that reads Stolfi's grammar and counts the number of terms it accepts. There is also some code to alternatively generate these terms in various ways. Keep in mind that the number of these terms is huge. A local version of Stolfi's grammar suitable for parsing can be found [here]().
+
+<a id="Note2">**{2}**</a> Personal communication, October 2021.
 
 
 ---
 
@@ -1,8 +1,8 @@
 # Note 007 - A Graph View on Word Structure
 
-_Last updated Oct. 24th, 2021._
+_Last updated Jan. 13th, 2022._
 
-_This note refers to [release v.5.0.0](https://github.com/mzattera/v4j/tree/v.5.0.0) of v4j;
+_This note refers to [release v.6.0.0](https://github.com/mzattera/v4j/tree/v.6.0.0) of v4j;
 **links to classes and files refer to this release**; files might have been changed, deleted or moved in the current master branch.
 In addition, some of this note content might have become obsolete in more recent versions of the library._
 
@@ -17,34 +17,34 @@ _Please refer to the [home page](..) for a set of definitions that might be rele
 
 
 # Abstract
+**At this stage, this note is a placeholder for a work in progress that I should finalize ASAP.**
+**IMAGES AND RELATED COMMENTS MUST BE REFRESHED AND VALIDATED.**
 
 
 # Methodology
 
 This work builds on my [slot model](../005) for Voynich words. 
-** Unless differently noted, this pages uses the Slot alphabet to transliterate Voynich words. **
+**Unless otherwise noted, this page uses the Slot alphabet to transliterate Voynich words.**
 
-I created a graph where nodes are charters in their slots; e.g. "1_o" represent character 'o' in slot number 1. 
+I created a graph[{1}](#Note1) where nodes are characters in their slots; e.g. "1_o" represents character 'o' in slot number 1. 
 
 After that, I connected node A with node B if there is a regular term in the Voynich where character B directly follows character A;
 the connection is a directed edge with a weight equal to the number of terms where the characters are connected.
-For visualization purposes I remove all edges with a weight less than 10.
+For visualization purposes I remove all edges with a weight less than 10[{2}](#Note2). 
 
 Final note: when possible, I push characters to the rightmost available slot.
 
 The resulting graph is shown below and commented further.
 
 ![Complete word structure graph.](images/Complete.PNG)
 
-** To see the pictures properly, right click on them and open them in a different tab. **
+**To see the pictures properly, right click on them and open them in a different tab.**
 
 
 # Analysis
 
 Here I analyze character connections slot by slot.
 
-THE IMAGES MUST BE REDONE (AND SOME OF THE CONCLUSIONS TOO)
-
 ## Slot 0
 
 Characters in slot 0 behave quite differently from one another.
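
The graph construction described in the Methodology section of Note 007 (nodes are characters in their slots, a directed edge from A to B is weighted by the number of terms in which B directly follows A, and edges with weight below 10 are dropped for visualization) can be sketched as follows. This is a minimal sketch under stated assumptions, not the code of `BuildSlotStateMachine`: the class and method names are hypothetical, terms are assumed to arrive already split into slot-labelled nodes such as "1_o", and the toy node labels in `main` other than "1_o" are made up and do not reflect actual slot assignments.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SlotGraphSketch {

    /**
     * Builds edge weights: key "A->B", value = number of terms in which
     * node B directly follows node A. Each term counts at most once per edge.
     */
    static Map<String, Integer> buildEdges(List<List<String>> slottedTerms) {
        Map<String, Integer> weights = new LinkedHashMap<>();
        for (List<String> term : slottedTerms) {
            Set<String> seenInTerm = new HashSet<>();
            for (int i = 0; i + 1 < term.size(); i++) {
                String edge = term.get(i) + "->" + term.get(i + 1);
                if (seenInTerm.add(edge)) {
                    weights.merge(edge, 1, Integer::sum);
                }
            }
        }
        return weights;
    }

    /** Keeps only edges whose weight is at least minWeight (the note uses 10). */
    static Map<String, Integer> prune(Map<String, Integer> weights, int minWeight) {
        Map<String, Integer> kept = new LinkedHashMap<>();
        weights.forEach((edge, weight) -> {
            if (weight >= minWeight) {
                kept.put(edge, weight);
            }
        });
        return kept;
    }

    public static void main(String[] args) {
        // Toy input: hypothetical node labels, NOT real slot assignments.
        List<List<String>> terms = List.of(
                List.of("1_o", "3_k", "11_y"),
                List.of("1_o", "3_k"),
                List.of("1_o", "11_y"));
        // A threshold of 2 is used here only because the toy input is tiny.
        prune(buildEdges(terms), 2)
                .forEach((edge, weight) -> System.out.println(edge + " weight=" + weight));
    }
}
```

Counting each edge once per term follows the note's wording ("the number of terms where the characters are connected"); whether the original implementation counts terms or individual occurrences is an assumption here.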
@@ -167,6 +167,17 @@ Noticeable difference is that, while 'l' and 'r' can be followed by the word fin
 This slot contains the word ending 'y' alone.
 
 ![11_d](images/11_d.PNG)
+	
+---
+
+**Notes**
+
+<a id="Note1">**{1}**</a> Class [`io.github.mzattera.v4j.applications.slot.BuildSlotStateMachine`]() was used to generate the graph,
+which was then visualized using [Gephi](https://gephi.org/).
+
+<a id="Note2">**{2}**</a> Please note that, as you can see 
+from the [glyph count by slot](../005/#GliphCountImg), some glyphs appear in fewer than 1% of the terms; this means they
+have fewer than 28 incoming connections in total and might therefore look unconnected in this graph.
 
 
 ---
 
@@ -18,7 +18,7 @@
 			<attribute name="maven.pomderived" value="true"/>
 		</attributes>
 	</classpathentry>
-	<classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER/org.eclipse.jdt.internal.debug.ui.launcher.StandardVMType/JavaSE-15">
+	<classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER/org.eclipse.jdt.internal.debug.ui.launcher.StandardVMType/JavaSE-11">
 		<attributes>
 			<attribute name="maven.pomderived" value="true"/>
 		</attributes>
 
@@ -1,9 +1,9 @@
 eclipse.preferences.version=1
 org.eclipse.jdt.core.compiler.codegen.inlineJsrBytecode=enabled
 org.eclipse.jdt.core.compiler.codegen.methodParameters=do not generate
-org.eclipse.jdt.core.compiler.codegen.targetPlatform=15
+org.eclipse.jdt.core.compiler.codegen.targetPlatform=11
 org.eclipse.jdt.core.compiler.codegen.unusedLocal=preserve
-org.eclipse.jdt.core.compiler.compliance=15
+org.eclipse.jdt.core.compiler.compliance=11
 org.eclipse.jdt.core.compiler.debug.lineNumber=generate
 org.eclipse.jdt.core.compiler.debug.localVariable=generate
 org.eclipse.jdt.core.compiler.debug.sourceFile=generate
@@ -13,4 +13,4 @@ org.eclipse.jdt.core.compiler.problem.enumIdentifier=error
 org.eclipse.jdt.core.compiler.problem.forbiddenReference=warning
 org.eclipse.jdt.core.compiler.problem.reportPreviewFeatures=warning
 org.eclipse.jdt.core.compiler.release=disabled
-org.eclipse.jdt.core.compiler.source=15
+org.eclipse.jdt.core.compiler.source=11
@@ -9,7 +9,7 @@
 	<description>"Applications" using v4j Java to play with the Voynich manuscript.</description>
 	<properties>
 		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
-		<maven.compiler.source>15</maven.compiler.source>
-		<maven.compiler.target>15</maven.compiler.target>
+		<maven.compiler.source>11</maven.compiler.source>
+		<maven.compiler.target>11</maven.compiler.target>
 	</properties>
 </project>
@@ -0,0 +1,9 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<classpath>
+	<classpathentry kind="src" path="src"/>
+	<classpathentry kind="src" path="src-gen"/>
+	<classpathentry kind="src" path="xtend-gen"/>
+	<classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER/org.eclipse.jdt.internal.debug.ui.launcher.StandardVMType/JavaSE-11"/>
+	<classpathentry kind="con" path="org.eclipse.pde.core.requiredPlugins"/>
+	<classpathentry kind="output" path="bin"/>
+</classpath>
@@ -0,0 +1 @@
+/bin/
@@ -0,0 +1,22 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<launchConfiguration type="org.eclipse.emf.mwe2.launch.Mwe2LaunchConfigurationType">
+    <stringAttribute key="org.eclipse.debug.core.ATTR_REFRESH_SCOPE" value="${working_set:&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&#13;&#10;&lt;resources&gt;&#13;&#10;&lt;item path=&quot;/io.github.mzattera.v4j.cmc.count&quot; type=&quot;4&quot;/&gt;&#13;&#10;&lt;/resources&gt;}"/>
+    <listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_PATHS">
+        <listEntry value="/io.github.mzattera.v4j.cmc"/>
+    </listAttribute>
+    <listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_TYPES">
+        <listEntry value="4"/>
+    </listAttribute>
+    <listAttribute key="org.eclipse.debug.ui.favoriteGroups">
+        <listEntry value="org.eclipse.debug.ui.launchGroup.debug"/>
+        <listEntry value="org.eclipse.debug.ui.launchGroup.run"/>
+    </listAttribute>
+    <booleanAttribute key="org.eclipse.jdt.launching.ATTR_ATTR_USE_ARGFILE" value="false"/>
+    <booleanAttribute key="org.eclipse.jdt.launching.ATTR_SHOW_CODEDETAILS_IN_EXCEPTION_MESSAGES" value="true"/>
+    <booleanAttribute key="org.eclipse.jdt.launching.ATTR_USE_CLASSPATH_ONLY_JAR" value="false"/>
+    <stringAttribute key="org.eclipse.jdt.launching.MAIN_TYPE" value="org.eclipse.emf.mwe2.launch.runtime.Mwe2Launcher"/>
+    <stringAttribute key="org.eclipse.jdt.launching.MODULE_NAME" value="io.github.mzattera.v4j.cmc.count"/>
+    <stringAttribute key="org.eclipse.jdt.launching.PROGRAM_ARGUMENTS" value="io.github.mzattera.v4j.cmc.count.GenerateCmcCounter"/>
+    <stringAttribute key="org.eclipse.jdt.launching.PROJECT_ATTR" value="io.github.mzattera.v4j.cmc"/>
+    <stringAttribute key="org.eclipse.jdt.launching.VM_ARGUMENTS" value="-Xmx512m"/>
+</launchConfiguration>
@@ -0,0 +1,34 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<projectDescription>
+	<name>io.github.mzattera.v4j.cmc</name>
+	<comment></comment>
+	<projects>
+	</projects>
+	<buildSpec>
+		<buildCommand>
+			<name>org.eclipse.xtext.ui.shared.xtextBuilder</name>
+			<arguments>
+			</arguments>
+		</buildCommand>
+		<buildCommand>
+			<name>org.eclipse.jdt.core.javabuilder</name>
+			<arguments>
+			</arguments>
+		</buildCommand>
+		<buildCommand>
+			<name>org.eclipse.pde.ManifestBuilder</name>
+			<arguments>
+			</arguments>
+		</buildCommand>
+		<buildCommand>
+			<name>org.eclipse.pde.SchemaBuilder</name>
+			<arguments>
+			</arguments>
+		</buildCommand>
+	</buildSpec>
+	<natures>
+		<nature>org.eclipse.xtext.ui.shared.xtextNature</nature>
+		<nature>org.eclipse.jdt.core.javanature</nature>
+		<nature>org.eclipse.pde.PluginNature</nature>
+	</natures>
+</projectDescription>
@@ -0,0 +1,2 @@
+eclipse.preferences.version=1
+encoding/<project>=UTF-8
@@ -0,0 +1,10 @@
+eclipse.preferences.version=1
+org.eclipse.jdt.core.compiler.codegen.inlineJsrBytecode=enabled
+org.eclipse.jdt.core.compiler.codegen.targetPlatform=11
+org.eclipse.jdt.core.compiler.compliance=11
+org.eclipse.jdt.core.compiler.problem.assertIdentifier=error
+org.eclipse.jdt.core.compiler.problem.enablePreviewFeatures=disabled
+org.eclipse.jdt.core.compiler.problem.enumIdentifier=error
+org.eclipse.jdt.core.compiler.problem.reportPreviewFeatures=warning
+org.eclipse.jdt.core.compiler.release=enabled
+org.eclipse.jdt.core.compiler.source=11