Incorrect word coordinates #1712

### Environment

* **Tesseract Version**:

      tesseract --version
      tesseract 4.0.0-beta.1
      leptonica-1.76.0 (Jun 26 2018, 18:21:40) [MSC v.1900 LIB Release x86]
       libgif 5.1.4 : libjpeg 9b : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
      Found AVX
      Found SSE
    (The binary reports beta.1, but beta.3 appears to be correct.)
* **Commit Number**: AppVeyor: 4.0.0-beta.3.1776
* **Platform**: Windows 10 64-bit (Tesseract runs as a 32-bit process)
* **Tessdata**: tessdata-fast

We've integrated the engine in our (closed-source) application using the C API, so I cannot share the actual code.

I iterate over the result iterator at the `RIL_WORD` level, get the text, bounding box, and baseline for each word, and then create a PDF from them, with the recognized text as a red overlay and the bounding boxes drawn in green for better visibility.
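For illustration only (this is not the application's code, which cannot be shared): a minimal sketch of how a word box could be mapped into PDF space for such an overlay. It assumes the boxes are in image pixels with the origin at the top-left corner, and that the PDF page uses the default user space (points of 1/72 inch, origin at the bottom-left). The `PixelBox` and `PdfRect` types and the `pixel_box_to_pdf` function are hypothetical names introduced here.

```c
#include <assert.h>

/* Word box in image pixels, origin at the top-left corner (hypothetical type). */
typedef struct {
    int left, top, right, bottom;
} PixelBox;

/* Rectangle in PDF user space: points (1/72 inch), origin at the bottom-left. */
typedef struct {
    double x, y, width, height;
} PdfRect;

/* Map a pixel-space box to PDF user space for an image scanned at `dpi`. */
PdfRect pixel_box_to_pdf(PixelBox box, int image_height_px, double dpi)
{
    assert(dpi > 0.0);
    const double scale = 72.0 / dpi;  /* points per pixel */
    PdfRect r;
    r.x = box.left * scale;
    /* Flip the y axis: the box's bottom edge becomes the rectangle's lower edge. */
    r.y = (image_height_px - box.bottom) * scale;
    r.width = (box.right - box.left) * scale;
    r.height = (box.bottom - box.top) * scale;
    return r;
}
```

The mapping scales and flips every box in the same way, so under these assumptions it cannot move the boundary between two neighbouring words relative to the glyphs.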

The official PDF output config has the same flaw, although it is harder to spot:
```
tesseract andromeda.png andromeda.tess4cli pdf
```
produces the following PDF:
[andromeda.tess4cli.pdf](https://github.com/tesseract-ocr/tesseract/files/2144491/andromeda.tess4cli.pdf)

### Files:
The behavior can be reproduced with the following PNG (a crop of a larger file):

![andromeda](https://user-images.githubusercontent.com/13793075/42023959-649b12ae-7ac1-11e8-8d07-1b4de8c23e51.png)

### Current Behavior:
The coordinates are correct for most words.
For some words, however, the boundary between two adjacent words is computed incorrectly.
Curiously, the wrong boundary is always placed before the last character of the previous word.
It almost looks like some kind of off-by-one error.

![image](https://user-images.githubusercontent.com/13793075/42024152-d47104a8-7ac1-11e8-87a1-02e937999c18.png)

![image](https://user-images.githubusercontent.com/13793075/42022559-e41e5d1e-7abd-11e8-9352-cde659674c38.png)

### Expected Behavior:
Result with Tesseract 3:

![image](https://user-images.githubusercontent.com/13793075/42024227-fad1eebe-7ac1-11e8-96ce-0ecef280055d.png)

![image](https://user-images.githubusercontent.com/13793075/42022604-ffae1326-7abd-11e8-95e6-33854f6e8fbb.png)
