Commit 1ceca89

committed
Ready for v.7.0
1 parent 7361315 commit 1ceca89

64 files changed

+9059
-40
lines changed


README.md

Lines changed: 5 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -233,6 +233,11 @@ This package contains some "utility" classes to deal with files, math, etc.
233233
Please take a look what is in here before implementing anythign from scratch.
234234

235235

236+
### `io.github.mzattera.v4j.cmc`
237+
238+
This is a [Xtext](https://www.eclipse.org/Xtext/) project created for [Note 006](https://mzattera.github.io/v4j/006/); please refer to it for more details.
239+
240+
236241
### Testing
237242

238243
Project `io.github.mzattera.v4j-apps` contains JUnit tests for the v4j library and (some) of the "applications" in `v4j-apps`.

docs/005/images/Gallows.PNG

6.73 KB

docs/005/index.md

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
# Note 005 - Slots and a New Alphabet
22

3-
_Last updated Jan. 9th, 2022._
3+
_Last updated Jan. 10th, 2022._
44

55
_This note refers to [release v.6.0.0](https://github.com/mzattera/v4j/tree/v.6.0.0) of v4j;
66
**links to classes and files refer to this release**; files might have been changed, deleted or moved in the current master branch.
@@ -124,6 +124,7 @@ appearing elsewhere in the text. The remaining cases (2 out of 100) are mostly w
124124

125125
The below table shows occurrence of glyphs in slots for regular terms [{2}](#Note2).
126126

127+
<a id="GliphCountImg" />
127128
![Table with glyph count by slot.](images/Char Count by Slot.PNG)
128129

129130
As expected, the distribution of glyphs in slots varies based on Currier language and illustration:
@@ -146,7 +147,7 @@ Below, I analyze more in detail some relationships between glyphs, as they appea
146147

147148
#### Rare Characters
148149

149-
Some EVA characters appears in the original interlinear transliteration very seldom[{3}](#Note3), end even less frequently in the concordance version used,
150+
Some EVA characters seldom appear in the original interlinear transliteration[{3}](#Note3), end even less frequently in the concordance version used,
150151
where they appear mostly as single characters, as shown in the table below (which also considers "unreadable" tokens).
151152
For this reason, I decided to ignore these characters and mark them as "unreadable character" for this analysis.
152153

@@ -176,7 +177,7 @@ However:
176177

177178
This leads me to think pedestalled gallows are Voynich characters in their own, and not ligatures.
178179

179-
In addition, the character 'c' appears outside of the pedestal or pedestalled gallows only 7 times ('c', 'oc','chcpar', 'ckshy', 'ocfshy', 'cs?t?eey', and 'o?cs'); similarly, the character 'h' appears outside of the pedestal, the pedestalled gallows or the "plumed" pedestal ('sh') only 4 times ('theody', 'docfhhy', 'cfhhy', adn 'd?ithy'). This seems a strong indication that EVA 'c' and 'h' do not correspond to Voynich characters[{4}](#Note4)[{5}](#Note5).
180+
In addition, the character 'c' appears outside of the pedestal or pedestalled gallows only 7 times ('c', 'oc', 'chcpar', 'ckshy', 'ocfshy', 'cs?t?eey', and 'o?cs'); similarly, the character 'h' appears outside of the pedestal, the pedestalled gallows or the "plumed" pedestal only 4 times ('theody', 'docfhhy', 'cfhhy', and 'd?ithy'). This seems a strong indication that EVA 'c' and 'h' do not correspond to Voynich characters[{4}](#Note4)[{5}](#Note5).
180181

181182

182183
#### 'e' and 'i'

docs/006/index.md

Lines changed: 25 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,8 @@
11
# Note 006 - Works on Word Structure
22

3-
_Last updated Jan. 9th, 2022._
3+
_Last updated Jan. 16th, 2022._
44

5-
_This note refers to [release v.6.0.0](https://github.com/mzattera/v4j/tree/v.6.0.0) of v4j;
5+
_This note refers to [release v.7.0.0](https://github.com/mzattera/v4j/tree/v.7.0.0) of v4j;
66
**links to classes and files refer to this release**; files might have been changed, deleted or moved in the current master branch.
77
In addition, some of this note content might have become obsolete in more recent versions of the library._
88

@@ -19,7 +19,6 @@ _Please refer to the [home page](..) for a set of definitions that might be rele
1919
In this page I will list, review, and comment works from different authors about the inner structure of Voynich words.
2020
When appropriate, I will compare their findings with my [slots concept](../005).
2121

22-
I expect these notes to grow and refine over time (as for the others, to be honest).
2322
Number in square brackets indicate the date when corresponding works were published (as far as I can determine it).
2423

2524

@@ -88,25 +87,32 @@ Similarly, it can be seen that gallows in slots 3 and 7, which belong to the cor
8887

8988
Stolfi notes: "_When designing the grammar, we tried to strike a useful balance between a simple and informative model and one that would cover as much of the corpus as possible. ... Conversely, the grammar is probably too permissive in many points, so that many words that it classifies as normal are in fact errors or non-word constructs_". It should be noted that the grammar is really good in parsing Voynichese
9089
(accordingly to Solfi it covers "_over 96.5% of all the tokens (word instances) in the text_") but,
91-
on the other side, it is also very bad in recognizing what is not Voynichese; the grammar accepts something in the order of 1.4e20 (100 billions of billions) different terms, only about 4'500 of which are terms in the manuscript ([concordance version](https://github.com/mzattera/v4j#ivtff)). Just for comparison, all the words that can be generated by the slot model amount at a total of 16'753'291 (13 order of magnitude less) of which around 2'800 are Voynich terms; the model covers slightly more than 88% of tokens (98% considering separable terms) but it is much easier to describe and understand.
90+
on the other side, it is also very bad in recognizing what is not Voynichese; the grammar accepts something in the order of 1.4e20 (100 billions of billions) different terms[{1}](#Note1), only about 4'500 of which are terms in the manuscript ([concordance version](https://github.com/mzattera/v4j#ivtff)). Just for comparison, all the words that can be generated by the slot model amount at a total of 16'753'291 (13 order of magnitude less) of which around 2'800 are Voynich terms; the model covers slightly more than 88% of tokens (98% considering separable terms) but it is much easier to describe and understand.
9291

93-
In summary, I do agree with Stolfi (and other authors) that the order in which characters appears in Voynich
94-
words is not arbitrary, but I think his model is misleading in suggesting a "layered" structure; for example,
95-
word prefixes and suffixes, which in Solfi's model both belong to the same layer (the crust), are indeed very different and assigning them to the same word structure looks completely arbitrary; ultimately, it seems
96-
the grammar suggests a "sequence" of possible characters, rather than a "onion-like" structure for words.
97-
If this is the case, it must be said the other, much simpler, models in this page show the same overall
98-
structure of Voynich words even if in less details or with less coverage of Voynich terms.
99-
Regarding the fine details, these might not be as relevant as Stolfi admits that "_one should not give too much weight to the finer divisions and associations implied by our parse trees_". It should also be mentioned that the grammar
100-
looks unnecessary complex, mostly because of the way it handles "circles"; this makes very difficult to grasp the structure of Voynichese below the most superficial levels by looking at the grammar. This is further complicated by the fact that the huge majority of the words the grammar describes are clearly very different by those we found in the text.
92+
In summary:
93+
94+
- I do agree with Stolfi (and other authors) that the order in which characters appears in Voynich
95+
words is not arbitrary.
96+
97+
- From a high-level structure point of view, the "layered" structure Stolfi proposes is debatable. For example,
98+
word prefixes and suffixes, which in Solfi's model both belong to the same layer (the crust), are indeed very different and assigning them to the same word structure looks completely arbitrary. Similarly, divisions between mantle and core are unclear.
10199

100+
- From a fine-grain structure point of view, the grammar gets complex in an attempt to parse "circles" and 'e' sequence accordingly to
101+
rules, that, Stolfi agrees are arbitrary; he also adds: "_one should not give too much weight to the finer divisions and associations
102+
implied by our parse trees_".
103+
104+
- The grammar has a very good coverage of the Voynich vocabulary but it does it at the expenses of selectivity. In other words,
105+
the grammar recognizes a huge amount of words that are not in the text and that clearly do not look like
106+
Voynichese (e.g. words starting with 'yqaynoyx-').
102107

103108

104-
# Philip Neal [?]
109+
110+
# Philip Neal
105111

106112
After writing [working note 005](../005), I realized Philip Neal published a [very similar concept](http://philipneal.net/voynichsources/transcription_neva_spaced/);
107113
his point was that this could be the result of using a grille to produce the text, something similar to the more complete approach described in [RUGG (2004)](../biblio.md).
108114

109-
Neal confirmed [{1}](#Note1) that his grid scheme (which pre-dates my notes) is much the same as my slot concept and,
115+
Neal confirmed [{2}](#Note2) that his grid scheme (which pre-dates my notes) is much the same as my slot concept and,
110116
though he only put f103r on his website, he has analyzed every page of the manuscript along the same lines.
111117

112118
He also mentions he knows of at least another researcher coming to similar conclusions independently.
@@ -147,7 +153,7 @@ So NEVA and the Slot alphabet have different objectives, as my proposal aims at
147153
"fine distinctions" in Glen Claston's Voynich 101.
148154

149155

150-
# Sean B. Palmer [2004?]
156+
# Sean B. Palmer [2004]
151157

152158
I found the below grammar attributed to Palmer by [Pelling](http://ciphermysteries.com/2010/11/22/sean-palmers-voynichese-word-generator) (see also below):
153159

@@ -170,7 +176,7 @@ O = o
170176
Accordingly to Pelling, Palmer claims this grammar can generate 97% of Voynichese words, but this is clearly (as Pelling says) because it generates a lot of words (potentially infinite strictly looking at the grammar).
171177

172178

173-
# Elmar Vogt [2009?]
179+
# Elmar Vogt [2009]
174180

175181
Created a [grammar](https://voynichthoughts.wordpress.com/grammar/) for existing words in the Voynich Manuscript by analyzing the
176182
stars section of the Voynich, which is written in Currier's B language.
@@ -193,7 +199,9 @@ This pattern is fundamentally based on shapes of individual glyphs but also info
193199

194200
**Notes**
195201

196-
<a id="Note1">**{1}**</a> Personal communication, October 2021.
202+
<a id="Note1">**{1}**</a> Project [io.github.mzattera.v4j.cmc]() is a [Xtext](https://www.eclipse.org/Xtext/) project with a simple grammar that reads Stolfi's grammar and counts the number of terms it accepts. There is also some code to alternatively generate these terms in various way. Keep in mind, the number of these terms is huge. A local version of Stolfi's grammar suitable for parsing can be found [here]().
203+
204+
<a id="Note2">**{2}**</a> Personal communication, October 2021.
197205

198206

199207
---
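
The size comparison made in the Stolfi section of `docs/006/index.md` (a grammar accepting on the order of 1.4e20 terms against 16'753'291 words for the slot model) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the class and method names (`MagnitudeCheck`, `ordersOfMagnitude`, `attestedFraction`) are hypothetical and not part of v4j, and the figures are the approximate ones quoted in the note.

```java
// Illustrative only: not part of v4j. All numbers are the approximate
// figures quoted in Note 006.
final class MagnitudeCheck {

    /** Returns log10(larger / smaller): how many orders of magnitude separate two counts. */
    static double ordersOfMagnitude(double larger, double smaller) {
        return Math.log10(larger / smaller);
    }

    /** Fraction of the words a model can generate that are attested terms in the text. */
    static double attestedFraction(double attestedTerms, double generatedWords) {
        return attestedTerms / generatedWords;
    }

    public static void main(String[] args) {
        double stolfiWords = 1.4e20;     // terms accepted by Stolfi's grammar (quoted estimate)
        double slotWords = 16_753_291d;  // words that the slot model can generate

        System.out.printf("Orders of magnitude apart: %.1f%n",
                ordersOfMagnitude(stolfiWords, slotWords));
        System.out.printf("Attested fraction, grammar: %.1e%n",
                attestedFraction(4_500, stolfiWords));
        System.out.printf("Attested fraction, slots:   %.1e%n",
                attestedFraction(2_800, slotWords));
    }
}
```

The first figure comes out just under 13, which matches the "13 order of magnitude" statement in the note; the two fractions give a rough feel for how selective each model is.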

docs/007/index.md

Lines changed: 19 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,8 @@
11
# Note 007 - A Graph View on Word Structure
22

3-
_Last updated Oct. 24th, 2021._
3+
_Last updated Jan. 13th, 2021._
44

5-
_This note refers to [release v.5.0.0](https://github.com/mzattera/v4j/tree/v.5.0.0) of v4j;
5+
_This note refers to [release v.6.0.0](https://github.com/mzattera/v4j/tree/v.6.0.0) of v4j;
66
**links to classes and files refer to this release**; files might have been changed, deleted or moved in the current master branch.
77
In addition, some of this note content might have become obsolete in more recent versions of the library._
88

@@ -17,34 +17,34 @@ _Please refer to the [home page](..) for a set of definitions that might be rele
1717

1818

1919
# Abstract
20+
**At this stage, this note is a placeholder of a work in progress I should finalize ASAP.**
21+
**IMAGES AND RELATIVE COMMENTS MUST BE REFRESHED AND VALIDATED**
2022

2123

2224
# Methodology
2325

2426
This work builds on my [slot model](../005) for Voynich words.
25-
** Unless differently noted, this pages uses the Slot alphabet to transliterate Voynich words. **
27+
**Unless differently noted, this pages uses the Slot alphabet to transliterate Voynich words.**
2628

27-
I created a graph where nodes are charters in their slots; e.g. "1_o" represent character 'o' in slot number 1.
29+
I created a graph[{1}](#Note1) where nodes are charters in their slots; e.g. "1_o" represent character 'o' in slot number 1.
2830

2931
After that, I connected node A with node B if there is a regular term in the Voynich where character B follows directly character A;
3032
the connection is a directed edge with a weight equal the number of terms where the characters are connected.
31-
For visualization purposes I remove all edges with a weight less than 10.
33+
For visualization purposes I remove all edges with a weight less than 10[{2}](#Note2).
3234

3335
Final note, when possible I push characters to the rightmost available slot.
3436

3537
The resulting graph is shown below and commented further.
3638

3739
![Complete word structure graph.](images/Complete.PNG)
3840

39-
** To see the pictures properly, right click on them and open them in a different tab. **
41+
**To see the pictures properly, right click on them and open them in a different tab.**
4042

4143

4244
# Analysis
4345

4446
Here i analyze char connections slot by slot.
4547

46-
THE IMAGES MUST BE REDONE (AND SOME OF THE CONCLUSIONS TOO)
47-
4848
## Slot 0
4949

5050
Characters in slot 0 behave quite different one another.
@@ -167,6 +167,17 @@ Noticeable difference is that, while 'l' and 'r' can be followed by the word fin
167167
This slot contains the word ending 'y' alone.
168168

169169
![11_d](images/11_d.PNG)
170+
171+
---
172+
173+
**Notes**
174+
175+
<a id="Note1">**{1}**</a> Class [`io.github.mzattera.v4j.applications.slot.BuildSlotStateMachine`]() was used to generate the graph,
176+
that was then visualized using [Gephi](https://gephi.org/).
177+
178+
<a id="Note2">**{2}**</a> Please notice that, as you can see
179+
from the [glyph count by slot](../005/#GliphCountImg), some glyphs appear in less than 1% of the terms, that means they will
180+
have less than 28 total incoming connection, therefore they might look unconnected in this graph.
170181

171182

172183
---
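
The graph construction described in the Methodology section of `docs/007/index.md` (nodes are characters in their slots, a directed edge A->B is weighted by the number of terms where B directly follows A, and edges lighter than 10 are dropped for display) might be sketched as follows. This is not the code of `BuildSlotStateMachine`: the class `SlotGraphSketch`, its methods, and the input format (each term already decomposed into node labels such as "1_o") are assumptions made for illustration, and the sample labels are made up rather than real slot assignments.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: a hypothetical stand-in, not the v4j implementation.
final class SlotGraphSketch {

    /**
     * Builds weighted directed edges, keyed as "A->B".
     * Each term is a list of node labels ("slot_char") in reading order.
     * An edge's weight is the number of terms in which it occurs.
     */
    static Map<String, Integer> buildGraph(List<List<String>> terms) {
        Map<String, Integer> edges = new LinkedHashMap<>();
        for (List<String> term : terms) {
            // Collect the edges of this term once, so a term never counts twice.
            Set<String> seen = new LinkedHashSet<>();
            for (int i = 0; i + 1 < term.size(); i++) {
                seen.add(term.get(i) + "->" + term.get(i + 1));
            }
            for (String edge : seen) {
                edges.merge(edge, 1, Integer::sum);
            }
        }
        return edges;
    }

    /** Keeps only edges whose weight is at least minWeight (10 in the note). */
    static Map<String, Integer> prune(Map<String, Integer> edges, int minWeight) {
        Map<String, Integer> kept = new LinkedHashMap<>();
        edges.forEach((edge, weight) -> {
            if (weight >= minWeight) {
                kept.put(edge, weight);
            }
        });
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample terms; labels do not reflect the real slot model.
        List<List<String>> terms = List.of(
                List.of("1_o", "3_k", "9_y"),
                List.of("1_o", "3_k"));
        Map<String, Integer> graph = buildGraph(terms);
        System.out.println(graph);
        System.out.println(prune(graph, 2));
    }
}
```

Pruning by weight is also why rare glyphs can end up looking unconnected, as Note {2} of that file points out: a node whose incoming edges are all light loses every one of them.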

eclipse/io.github.mzattera.v4j-apps/.classpath

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,7 @@
1818
<attribute name="maven.pomderived" value="true"/>
1919
</attributes>
2020
</classpathentry>
21-
<classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER/org.eclipse.jdt.internal.debug.ui.launcher.StandardVMType/JavaSE-15">
21+
<classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER/org.eclipse.jdt.internal.debug.ui.launcher.StandardVMType/JavaSE-11">
2222
<attributes>
2323
<attribute name="maven.pomderived" value="true"/>
2424
</attributes>
Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,9 @@
11
eclipse.preferences.version=1
22
org.eclipse.jdt.core.compiler.codegen.inlineJsrBytecode=enabled
33
org.eclipse.jdt.core.compiler.codegen.methodParameters=do not generate
4-
org.eclipse.jdt.core.compiler.codegen.targetPlatform=15
4+
org.eclipse.jdt.core.compiler.codegen.targetPlatform=11
55
org.eclipse.jdt.core.compiler.codegen.unusedLocal=preserve
6-
org.eclipse.jdt.core.compiler.compliance=15
6+
org.eclipse.jdt.core.compiler.compliance=11
77
org.eclipse.jdt.core.compiler.debug.lineNumber=generate
88
org.eclipse.jdt.core.compiler.debug.localVariable=generate
99
org.eclipse.jdt.core.compiler.debug.sourceFile=generate
@@ -13,4 +13,4 @@ org.eclipse.jdt.core.compiler.problem.enumIdentifier=error
1313
org.eclipse.jdt.core.compiler.problem.forbiddenReference=warning
1414
org.eclipse.jdt.core.compiler.problem.reportPreviewFeatures=warning
1515
org.eclipse.jdt.core.compiler.release=disabled
16-
org.eclipse.jdt.core.compiler.source=15
16+
org.eclipse.jdt.core.compiler.source=11

eclipse/io.github.mzattera.v4j-apps/pom.xml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@
99
<description>"Applications" using v4j Java to play with the Voynich manuscript.</description>
1010
<properties>
1111
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
12-
<maven.compiler.source>15</maven.compiler.source>
13-
<maven.compiler.target>15</maven.compiler.target>
12+
<maven.compiler.source>11</maven.compiler.source>
13+
<maven.compiler.target>11</maven.compiler.target>
1414
</properties>
1515
</project>
Binary file not shown.
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
1+
<?xml version="1.0" encoding="UTF-8"?>
2+
<classpath>
3+
<classpathentry kind="src" path="src"/>
4+
<classpathentry kind="src" path="src-gen"/>
5+
<classpathentry kind="src" path="xtend-gen"/>
6+
<classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER/org.eclipse.jdt.internal.debug.ui.launcher.StandardVMType/JavaSE-11"/>
7+
<classpathentry kind="con" path="org.eclipse.pde.core.requiredPlugins"/>
8+
<classpathentry kind="output" path="bin"/>
9+
</classpath>
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
/bin/
Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
2+
<launchConfiguration type="org.eclipse.emf.mwe2.launch.Mwe2LaunchConfigurationType">
3+
<stringAttribute key="org.eclipse.debug.core.ATTR_REFRESH_SCOPE" value="${working_set:&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&#13;&#10;&lt;resources&gt;&#13;&#10;&lt;item path=&quot;/io.github.mzattera.v4j.cmc.count&quot; type=&quot;4&quot;/&gt;&#13;&#10;&lt;/resources&gt;}"/>
4+
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_PATHS">
5+
<listEntry value="/io.github.mzattera.v4j.cmc"/>
6+
</listAttribute>
7+
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_TYPES">
8+
<listEntry value="4"/>
9+
</listAttribute>
10+
<listAttribute key="org.eclipse.debug.ui.favoriteGroups">
11+
<listEntry value="org.eclipse.debug.ui.launchGroup.debug"/>
12+
<listEntry value="org.eclipse.debug.ui.launchGroup.run"/>
13+
</listAttribute>
14+
<booleanAttribute key="org.eclipse.jdt.launching.ATTR_ATTR_USE_ARGFILE" value="false"/>
15+
<booleanAttribute key="org.eclipse.jdt.launching.ATTR_SHOW_CODEDETAILS_IN_EXCEPTION_MESSAGES" value="true"/>
16+
<booleanAttribute key="org.eclipse.jdt.launching.ATTR_USE_CLASSPATH_ONLY_JAR" value="false"/>
17+
<stringAttribute key="org.eclipse.jdt.launching.MAIN_TYPE" value="org.eclipse.emf.mwe2.launch.runtime.Mwe2Launcher"/>
18+
<stringAttribute key="org.eclipse.jdt.launching.MODULE_NAME" value="io.github.mzattera.v4j.cmc.count"/>
19+
<stringAttribute key="org.eclipse.jdt.launching.PROGRAM_ARGUMENTS" value="io.github.mzattera.v4j.cmc.count.GenerateCmcCounter"/>
20+
<stringAttribute key="org.eclipse.jdt.launching.PROJECT_ATTR" value="io.github.mzattera.v4j.cmc"/>
21+
<stringAttribute key="org.eclipse.jdt.launching.VM_ARGUMENTS" value="-Xmx512m"/>
22+
</launchConfiguration>
Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
<?xml version="1.0" encoding="UTF-8"?>
2+
<projectDescription>
3+
<name>io.github.mzattera.v4j.cmc</name>
4+
<comment></comment>
5+
<projects>
6+
</projects>
7+
<buildSpec>
8+
<buildCommand>
9+
<name>org.eclipse.xtext.ui.shared.xtextBuilder</name>
10+
<arguments>
11+
</arguments>
12+
</buildCommand>
13+
<buildCommand>
14+
<name>org.eclipse.jdt.core.javabuilder</name>
15+
<arguments>
16+
</arguments>
17+
</buildCommand>
18+
<buildCommand>
19+
<name>org.eclipse.pde.ManifestBuilder</name>
20+
<arguments>
21+
</arguments>
22+
</buildCommand>
23+
<buildCommand>
24+
<name>org.eclipse.pde.SchemaBuilder</name>
25+
<arguments>
26+
</arguments>
27+
</buildCommand>
28+
</buildSpec>
29+
<natures>
30+
<nature>org.eclipse.xtext.ui.shared.xtextNature</nature>
31+
<nature>org.eclipse.jdt.core.javanature</nature>
32+
<nature>org.eclipse.pde.PluginNature</nature>
33+
</natures>
34+
</projectDescription>
Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
eclipse.preferences.version=1
2+
encoding/<project>=UTF-8
Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
1+
eclipse.preferences.version=1
2+
org.eclipse.jdt.core.compiler.codegen.inlineJsrBytecode=enabled
3+
org.eclipse.jdt.core.compiler.codegen.targetPlatform=11
4+
org.eclipse.jdt.core.compiler.compliance=11
5+
org.eclipse.jdt.core.compiler.problem.assertIdentifier=error
6+
org.eclipse.jdt.core.compiler.problem.enablePreviewFeatures=disabled
7+
org.eclipse.jdt.core.compiler.problem.enumIdentifier=error
8+
org.eclipse.jdt.core.compiler.problem.reportPreviewFeatures=warning
9+
org.eclipse.jdt.core.compiler.release=enabled
10+
org.eclipse.jdt.core.compiler.source=11
