neovim
diff --git a/README.md b/README.md
Lines changed: 54 additions & 0 deletions
diff --git a/corpus/arguments.txt b/corpus/arguments.txt
Lines changed: 38 additions & 16 deletions
diff --git a/corpus/backtick.txt b/corpus/backtick.txt
Lines changed: 0 additions & 45 deletions
diff --git a/corpus/code_block.txt b/corpus/code_block.txt
Lines changed: 0 additions & 96 deletions
@@ -0,0 +1,54 @@
+tree-sitter-vimdoc
+==================
+
+This grammar intentionally supports a subset of the vimdoc "spec"; predictable
+results are the primary goal, so that _output_ formats (e.g. HTML) are
+well-formed; the _input_ (vimdoc) is secondary. The first step should always be
+to try to fix the input (within reason) rather than insist on a grammar that
+handles vimdoc's endless quirks.
+
+Notes
+-----
+
+- vimdoc format "spec":
+    - [:help help-writing](https://neovim.io/doc/user/helphelp.html#help-writing)
+    - https://github.com/nanotee/vimdoc-notes
+- whitespace is intentionally captured in `(word)`, because it is often needed to
+  lay out vim help files correctly (especially old/legacy ones).
+- `(codeblock)` is contained by `(line)` because `>` can start a code block at the end of a line.
+- `(column_heading)` is contained by `(line)` because `<` (to close
+  a `(codeblock)`) can appear at the start of `(column_heading)`.
+- `h1` ("Heading 1"): `======` followed by text and optional `*tags*`.
+- `h2` ("Heading 2"): `------` followed by text and optional `*tags*`.
+- `h3` ("Heading 3"): only UPPERCASE WORDS, followed by optional `*tags*`.
+
+Known issues
+------------
+
+- `line_li` ("list item") is _experimental_. It doesn't support nesting yet and
+  it may not work well; you can treat it as a normal `line` for layout purposes.
+- `codeblock` ">" must not be preceded only by tabs; a space char is required (" >").
+  See `:help lcs-tab` for an example. Currently the grammar doesn't enforce this.
+- `codeblock` terminated by an "implicit stop" (i.e. no terminating `<`)
+  consumes the first char of the terminating line, and continues the parent
+  `block`, preventing top-level forms like `h1`, `h2` from being recognized
+  until a blank line is encountered.
+- `line` in a `codeblock` does not contain `word` atoms; it's just the full
+  raw text line including whitespace. This is somewhat dictated by its
+  "preformatted" nature; parsing the contents would require loading a "child"
+  language (injection). See [#2](https://github.com/vigoux/tree-sitter-vimdoc/issues/2).
+- `url` doesn't handle _surrounding_ parens. E.g. `(https://example.com/#yay)` yields `word`.
+- `url` doesn't handle _nested_ parens. E.g. `(https://example.com/(foo)#yay)`.
+- Ideally `block_end` should consume the last block of the document _only_ if that
+  block is missing a trailing blank line or EOL ("\n").
+    - TODO: consider simply _not supporting_ docs without EOL?
+- Ideally `line_noeol` should consume the last line of the document _only_ if
+  that line is missing EOL ("\n").
+    - TODO: consider simply _not supporting_ docs without EOL?
+
+TODO
+----
+
+- `line_noeol` is a special case to support documents that don't end in EOL.
+  Grammar could be a bit simpler if we just require EOL at end of document.
+- `line_modeline` (only at EOF)
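
The README's preference for fixing the input rather than the grammar, together with the TODO about requiring EOL at the end of the document, suggests that a caller could normalize the text before parsing. The following is a minimal sketch of that idea in Python. The function name `ensure_trailing_eol` is hypothetical and is not part of this grammar or of tree-sitter.

```python
def ensure_trailing_eol(text: str) -> str:
    """Return `text` with a final "\n" if it lacks one.

    Hypothetical helper: it illustrates normalizing vimdoc input so the
    document always ends in EOL. Empty input is returned unchanged.
    """
    if text == "" or text.endswith("\n"):
        return text
    return text + "\n"
```

With input normalized this way, a special case such as `line_noeol` would only matter for callers that skip the normalization step.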
@@ -1,31 +1,53 @@
 ================================================================================
-Simple argument
+simple argument
 ================================================================================
 This in an argument: {arg}
 --------------------------------------------------------------------------------
 
 (help_file
-  (line
-    (word)
-    (word)
-    (word)
-    (word)
-    (argument
-      (word))))
+  (block
+    (line
+      (word)
+      (word)
+      (word)
+      (word)
+      (argument
+        (word)))))
 
 ================================================================================
-Multiple arguments on the same line
+multiple arguments on the same line
 ================================================================================
-
 {foo} {bar} {baz}
 
 --------------------------------------------------------------------------------
 
 (help_file
-  (line
-    (argument
-      (word))
-    (argument
-      (word))
-    (argument
+  (block
+    (line
+      (argument
+        (word))
+      (argument
+        (word))
+      (argument
+        (word)))))
+
+================================================================================
+NOT an argument
+================================================================================
+{foo "{bar}" `{baz}` |{baz| } {}
+
+--------------------------------------------------------------------------------
+
+(help_file
+  (block
+    (line
+      (argument
+        (word)
+        (ERROR))
+      (word)
+      (codespan
+        (word))
+      (taglink
+        (word))
+      (word)
       (word))))
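
The corpus file above follows the layout visible in the diff: a line of `=`, the test name, another line of `=`, the input text, a line of `-`, then the expected tree. As a rough illustration of that layout only, here is a small Python sketch that splits such a file into cases. The names `CorpusCase` and `parse_corpus` are hypothetical, and this is not how tree-sitter itself reads corpus files. In particular, it assumes the input text contains no line made up solely of `-` characters, which real vimdoc input (for example an `h2` rule) can violate.

```python
import re
from typing import List, NamedTuple


class CorpusCase(NamedTuple):
    name: str      # test name between the two "====" rules
    source: str    # input text, whitespace preserved
    expected: str  # expected S-expression, outer whitespace stripped


HEADER_RULE = re.compile(r"^={3,}$")
SPLIT_RULE = re.compile(r"^-{3,}$")


def _is_header_at(lines: List[str], i: int) -> bool:
    """True if lines[i] and lines[i + 2] are both "====" rules."""
    return (
        i + 2 < len(lines)
        and HEADER_RULE.match(lines[i]) is not None
        and HEADER_RULE.match(lines[i + 2]) is not None
    )


def parse_corpus(text: str) -> List[CorpusCase]:
    """Split corpus text into cases (simplified; see assumptions above)."""
    lines = text.splitlines()
    cases: List[CorpusCase] = []
    i = 0
    while i < len(lines):
        if not _is_header_at(lines, i):
            i += 1
            continue
        name = lines[i + 1].strip()
        i += 3
        source: List[str] = []
        while i < len(lines) and not SPLIT_RULE.match(lines[i]):
            source.append(lines[i])
            i += 1
        i += 1  # skip the dashed separator
        expected: List[str] = []
        while i < len(lines) and not _is_header_at(lines, i):
            expected.append(lines[i])
            i += 1
        cases.append(
            CorpusCase(name, "\n".join(source), "\n".join(expected).strip())
        )
    return cases
```

The input text is kept verbatim rather than stripped, since the README notes that whitespace matters for laying out help files.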