File: README.md (5 additions & 0 deletions)
@@ -233,6 +233,11 @@ This package contains some "utility" classes to deal with files, math, etc.
233
233
Please take a look at what is in here before implementing anything from scratch.
234
234
235
235
236
+
### `io.github.mzattera.v4j.cmc`
237
+
238
+
This is an [Xtext](https://www.eclipse.org/Xtext/) project created for [Note 006](https://mzattera.github.io/v4j/006/); please refer to it for more details.
239
+
240
+
236
241
### Testing
237
242
238
243
Project `io.github.mzattera.v4j-apps` contains JUnit tests for the v4j library and (some) of the "applications" in `v4j-apps`.
File: docs/005/index.md (4 additions & 3 deletions)
@@ -1,6 +1,6 @@
1
1
# Note 005 - Slots and a New Alphabet
2
2
3
-
_Last updated Jan. 9th, 2022._
3
+
_Last updated Jan. 10th, 2022._
4
4
5
5
_This note refers to [release v.6.0.0](https://github.com/mzattera/v4j/tree/v.6.0.0) of v4j;
6
6
**links to classes and files refer to this release**; files might have been changed, deleted or moved in the current master branch.
@@ -124,6 +124,7 @@ appearing elsewhere in the text. The remaining cases (2 out of 100) are mostly w
124
124
125
125
The table below shows the occurrence of glyphs in slots for regular terms [{2}](#Note2).
126
126
127
+
<a id="GliphCountImg" />
127
128

128
129
129
130
As expected, the distribution of glyphs in slots varies based on Currier language and illustration:
@@ -146,7 +147,7 @@ Below, I analyze more in detail some relationships between glyphs, as they appea
146
147
147
148
#### Rare Characters
148
149
149
-
Some EVA characters appears in the original interlinear transliteration very seldom[{3}](#Note3), end even less frequently in the concordance version used,
150
+
Some EVA characters seldom appear in the original interlinear transliteration[{3}](#Note3), and even less frequently in the concordance version used,
150
151
where they appear mostly as single characters, as shown in the table below (which also considers "unreadable" tokens).
151
152
For this reason, I decided to ignore these characters and mark them as "unreadable character" for this analysis.
152
153
@@ -176,7 +177,7 @@ However:
176
177
177
178
This leads me to think pedestalled gallows are Voynich characters in their own right, and not ligatures.
178
179
179
-
In addition, the character 'c' appears outside of the pedestal or pedestalled gallows only 7 times ('c', 'oc','chcpar', 'ckshy', 'ocfshy', 'cs?t?eey', and 'o?cs'); similarly, the character 'h' appears outside of the pedestal, the pedestalled gallows or the "plumed" pedestal ('sh') only 4 times ('theody', 'docfhhy', 'cfhhy', adn 'd?ithy'). This seems a strong indication that EVA 'c' and 'h' do not correspond to Voynich characters[{4}](#Note4)[{5}](#Note5).
180
+
In addition, the character 'c' appears outside of the pedestal or pedestalled gallows only 7 times ('c', 'oc','chcpar', 'ckshy', 'ocfshy', 'cs?t?eey', and 'o?cs'); similarly, the character 'h' appears outside of the pedestal, the pedestalled gallows or the "plumed" pedestal only 4 times ('theody', 'docfhhy', 'cfhhy', and 'd?ithy'). This seems a strong indication that EVA 'c' and 'h' do not correspond to Voynich characters[{4}](#Note4)[{5}](#Note5).
File: docs/006/index.md (25 additions & 17 deletions)
@@ -1,8 +1,8 @@
1
1
# Note 006 - Works on Word Structure
2
2
3
-
_Last updated Jan. 9th, 2022._
3
+
_Last updated Jan. 16th, 2022._
4
4
5
-
_This note refers to [release v.6.0.0](https://github.com/mzattera/v4j/tree/v.6.0.0) of v4j;
5
+
_This note refers to [release v.7.0.0](https://github.com/mzattera/v4j/tree/v.7.0.0) of v4j;
6
6
**links to classes and files refer to this release**; files might have been changed, deleted or moved in the current master branch.
7
7
In addition, some of this note's content might have become obsolete in more recent versions of the library._
8
8
@@ -19,7 +19,6 @@ _Please refer to the [home page](..) for a set of definitions that might be rele
19
19
On this page I will list, review, and comment on works from different authors about the inner structure of Voynich words.
20
20
When appropriate, I will compare their findings with my [slots concept](../005).
21
21
22
-
I expect these notes to grow and refine over time (as for the others, to be honest).
23
22
Numbers in square brackets indicate the date when the corresponding works were published (as far as I can determine it).
24
23
25
24
@@ -88,25 +87,32 @@ Similarly, it can be seen that gallows in slots 3 and 7, which belong to the cor
88
87
89
88
Stolfi notes: "_When designing the grammar, we tried to strike a useful balance between a simple and informative model and one that would cover as much of the corpus as possible. ... Conversely, the grammar is probably too permissive in many points, so that many words that it classifies as normal are in fact errors or non-word constructs_". It should be noted that the grammar is very good at parsing Voynichese
90
89
(according to Stolfi, it covers "_over 96.5% of all the tokens (word instances) in the text_") but,
91
-
on the other side, it is also very bad in recognizing what is not Voynichese; the grammar accepts something in the order of 1.4e20 (100 billions of billions) different terms, only about 4'500 of which are terms in the manuscript ([concordance version](https://github.com/mzattera/v4j#ivtff)). Just for comparison, all the words that can be generated by the slot model amount at a total of 16'753'291 (13 order of magnitude less) of which around 2'800 are Voynich terms; the model covers slightly more than 88% of tokens (98% considering separable terms) but it is much easier to describe and understand.
90
+
on the other hand, it is also very bad at recognizing what is not Voynichese; the grammar accepts something on the order of 1.4e20 (100 billion billion) different terms[{1}](#Note1), only about 4'500 of which are terms in the manuscript ([concordance version](https://github.com/mzattera/v4j#ivtff)). Just for comparison, the words that can be generated by the slot model total 16'753'291 (13 orders of magnitude fewer), of which around 2'800 are Voynich terms; the model covers slightly more than 88% of tokens (98% considering separable terms) but it is much easier to describe and understand.
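
The comparison above can be checked with a few lines of arithmetic. The sketch below is only an illustration and is not part of v4j: the class and method names are hypothetical, and the four figures are simply the ones quoted in the paragraph. It computes how many orders of magnitude separate the two models and what fraction of the words each model can generate are terms actually found in the manuscript.

```java
public class SelectivityCheck {

    /** Base-10 orders of magnitude separating two positive counts. */
    public static double ordersOfMagnitude(double larger, double smaller) {
        return Math.log10(larger / smaller);
    }

    /** Fraction of the words a model can generate that are attested terms. */
    public static double attestedFraction(double attestedTerms, double generatedWords) {
        return attestedTerms / generatedWords;
    }

    public static void main(String[] args) {
        // Figures quoted in the text; everything else here is illustrative.
        double grammarWords = 1.4e20;   // terms accepted by Stolfi's grammar
        double slotWords = 16_753_291;  // words generated by the slot model

        System.out.printf("orders of magnitude apart: %.1f%n",
                ordersOfMagnitude(grammarWords, slotWords));
        System.out.printf("attested fraction, grammar: %.1e%n",
                attestedFraction(4_500, grammarWords));
        System.out.printf("attested fraction, slots:   %.1e%n",
                attestedFraction(2_800, slotWords));
    }
}
```

The first value rounds to 13, matching the figure in the text; the two fractions make the difference in selectivity explicit, although they say nothing about token coverage, which has to be measured on the text itself.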
92
91
93
-
In summary, I do agree with Stolfi (and other authors) that the order in which characters appears in Voynich
94
-
words is not arbitrary, but I think his model is misleading in suggesting a "layered" structure; for example,
95
-
word prefixes and suffixes, which in Solfi's model both belong to the same layer (the crust), are indeed very different and assigning them to the same word structure looks completely arbitrary; ultimately, it seems
96
-
the grammar suggests a "sequence" of possible characters, rather than a "onion-like" structure for words.
97
-
If this is the case, it must be said the other, much simpler, models in this page show the same overall
98
-
structure of Voynich words even if in less details or with less coverage of Voynich terms.
99
-
Regarding the fine details, these might not be as relevant as Stolfi admits that "_one should not give too much weight to the finer divisions and associations implied by our parse trees_". It should also be mentioned that the grammar
100
-
looks unnecessary complex, mostly because of the way it handles "circles"; this makes very difficult to grasp the structure of Voynichese below the most superficial levels by looking at the grammar. This is further complicated by the fact that the huge majority of the words the grammar describes are clearly very different by those we found in the text.
92
+
In summary:
93
+
94
+
- I do agree with Stolfi (and other authors) that the order in which characters appear in Voynich
95
+
words is not arbitrary.
96
+
97
+
- From a high-level structure point of view, the "layered" structure Stolfi proposes is debatable. For example,
98
+
word prefixes and suffixes, which in Stolfi's model both belong to the same layer (the crust), are indeed very different and assigning them to the same word structure looks completely arbitrary. Similarly, divisions between mantle and core are unclear.
101
99
100
+
- From a fine-grain structure point of view, the grammar gets complex in an attempt to parse "circles" and 'e' sequences according to
101
+
rules that, Stolfi agrees, are arbitrary; he also adds: "_one should not give too much weight to the finer divisions and associations
102
+
implied by our parse trees_".
103
+
104
+
- The grammar has very good coverage of the Voynich vocabulary, but at the expense of selectivity. In other words,
105
+
the grammar recognizes a huge number of words that are not in the text and that clearly do not look like
106
+
Voynichese (e.g. words starting with 'yqaynoyx-').
102
107
103
108
104
-
# Philip Neal [?]
109
+
110
+
# Philip Neal
105
111
106
112
After writing [working note 005](../005), I realized Philip Neal published a [very similar concept](http://philipneal.net/voynichsources/transcription_neva_spaced/);
107
113
his point was that this could be the result of using a grille to produce the text, something similar to the more complete approach described in [RUGG (2004)](../biblio.md).
108
114
109
-
Neal confirmed [{1}](#Note1) that his grid scheme (which pre-dates my notes) is much the same as my slot concept and,
115
+
Neal confirmed [{2}](#Note2) that his grid scheme (which pre-dates my notes) is much the same as my slot concept and,
110
116
though he only put f103r on his website, he has analyzed every page of the manuscript along the same lines.
111
117
112
118
He also mentions he knows of at least one other researcher coming to similar conclusions independently.
@@ -147,7 +153,7 @@ So NEVA and the Slot alphabet have different objectives, as my proposal aims at
147
153
"fine distinctions" in Glen Claston's Voynich 101.
148
154
149
155
150
-
# Sean B. Palmer [2004?]
156
+
# Sean B. Palmer [2004]
151
157
152
158
I found the below grammar attributed to Palmer by [Pelling](http://ciphermysteries.com/2010/11/22/sean-palmers-voynichese-word-generator) (see also below):
153
159
@@ -170,7 +176,7 @@ O = o
170
176
According to Pelling, Palmer claims this grammar can generate 97% of Voynichese words, but this is clearly (as Pelling says) because it generates a lot of words (potentially infinitely many, looking strictly at the grammar).
171
177
172
178
173
-
# Elmar Vogt [2009?]
179
+
# Elmar Vogt [2009]
174
180
175
181
Vogt created a [grammar](https://voynichthoughts.wordpress.com/grammar/) for existing words in the Voynich Manuscript by analyzing the
176
182
stars section of the Voynich, which is written in Currier's B language.
@@ -193,7 +199,9 @@ This pattern is fundamentally based on shapes of individual glyphs but also info
193
199
194
200
**Notes**
195
201
196
-
<a id="Note1">**{1}**</a> Personal communication, October 2021.
202
+
<a id="Note1">**{1}**</a> Project [io.github.mzattera.v4j.cmc]() is an [Xtext](https://www.eclipse.org/Xtext/) project with a simple grammar that reads Stolfi's grammar and counts the number of terms it accepts. There is also some code to alternatively generate these terms in various ways. Keep in mind, the number of these terms is huge. A local version of Stolfi's grammar suitable for parsing can be found [here]().
203
+
204
+
<a id="Note2">**{2}**</a> Personal communication, October 2021.
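
As a rough illustration of the kind of counting mentioned in note {1}, the sketch below counts the strings derived by a small, non-recursive grammar. It is not the code of the Xtext project, and every name in it (`TermCounter`, `Rule`, `glyph`, `seq`, `alt`, `optional`) is hypothetical. It assumes the grammar is unambiguous and that optional parts never derive the empty string themselves; otherwise the result is only an upper bound. Alternatives add their counts, sequences multiply them, and an optional part adds one for the empty choice. `BigInteger` is used because a count such as 1.4e20 does not fit in a Java `long`.

```java
import java.math.BigInteger;

public class TermCounter {

    /** A grammar fragment that knows how many distinct strings it derives. */
    public interface Rule {
        BigInteger count();
    }

    /** A single glyph derives exactly one string. */
    public static Rule glyph() {
        return () -> BigInteger.ONE;
    }

    /** A sequence derives the product of the counts of its parts. */
    public static Rule seq(Rule... parts) {
        return () -> {
            BigInteger n = BigInteger.ONE;
            for (Rule p : parts) {
                n = n.multiply(p.count());
            }
            return n;
        };
    }

    /** A choice derives the sum of the counts of its options. */
    public static Rule alt(Rule... options) {
        return () -> {
            BigInteger n = BigInteger.ZERO;
            for (Rule o : options) {
                n = n.add(o.count());
            }
            return n;
        };
    }

    /** An optional part adds one more possibility: being absent. */
    public static Rule optional(Rule r) {
        return () -> r.count().add(BigInteger.ONE);
    }

    public static void main(String[] args) {
        // Toy model (made up for this example): three optional positions
        // offering 2, 3 and 1 glyphs respectively.
        Rule toy = seq(
                optional(alt(glyph(), glyph())),
                optional(alt(glyph(), glyph(), glyph())),
                optional(glyph()));
        System.out.println(toy.count()); // (2+1) * (3+1) * (1+1) = 24
    }
}
```

Note that the total of 24 includes the empty word, which a real count would probably want to exclude.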