Description
Current Behavior
Hello,
I iterated over the results at the RIL_SYMBOL level and ran into two problems. I am reporting them here in the hope that someone can help. Thanks very much!
Q1. The BoundingBox of some symbols overlaps the boxes of neighbouring symbols.
Q2. The BoundingBox of some symbols is too tall.
The output looks like this:
c l t r b
...
智 330 330 352 353
慧 357 329 391 354
物 392 331 414 354
流 414 329 443 353
地 443 330 459 353
图 466 331 484 354
计 891 330 914 354
算 918 329 1019 354
机 951 325 977 366
视 976 329 1005 354
觉 1004 329 1019 354
智 81 488 119 512
能 121 488 132 512
硬 144 489 168 512
件 169 489 185 512
数 615 488 639 512
字 642 488 676 513
化 676 489 692 512
Here I can see:
Q1: 算 918 329 1019 354 has a horizontal range (918 to 1019) that completely covers 机 951 325 977 366.
Q2: 机 951 325 977 366 has a height of 41 pixels, but in the actual image it is the same height as the other characters, only about 24 pixels.
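To double-check the arithmetic behind Q1 and Q2, here is a minimal sketch that recomputes the sizes from the logged values at compile time. The SymbolBox struct and the helper names are made up for illustration and are not part of the Tesseract API; the coordinates are copied from the output above.

```cpp
// Illustrative only: SymbolBox, width, height and spans_horizontally are
// hypothetical helpers, not Tesseract API.
struct SymbolBox {
    int left, top, right, bottom;  // same order as the log columns l t r b
};

constexpr int width(const SymbolBox &b) { return b.right - b.left; }
constexpr int height(const SymbolBox &b) { return b.bottom - b.top; }

// True when the horizontal range of `inner` lies inside that of `outer`.
constexpr bool spans_horizontally(const SymbolBox &outer, const SymbolBox &inner)
{
    return outer.left <= inner.left && inner.right <= outer.right;
}

// Values copied from the logged output.
constexpr SymbolBox kJi{891, 330, 914, 354};     // 计
constexpr SymbolBox kSuan{918, 329, 1019, 354};  // 算
constexpr SymbolBox kJi2{951, 325, 977, 366};    // 机
constexpr SymbolBox kShi{976, 329, 1005, 354};   // 视
constexpr SymbolBox kJue{1004, 329, 1019, 354};  // 觉

// Q1: the box of 算 is 101 px wide and spans the three symbols after it.
static_assert(width(kSuan) == 101, "width of 算");
static_assert(spans_horizontally(kSuan, kJi2), "算 spans 机");
static_assert(spans_horizontally(kSuan, kShi), "算 spans 视");
static_assert(spans_horizontally(kSuan, kJue), "算 spans 觉");

// Q2: 机 is reported 41 px tall, while its neighbour 计 is 24 px tall.
static_assert(height(kJi2) == 41, "height of 机");
static_assert(height(kJi) == 24, "height of 计");
```

The checks only restate what the log already shows: the right edge of 算 (1019) coincides with the right edge of 觉, so the box for 算 appears to extend to the end of the word.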
Here is the command line:
./main ./demo.png ./tessdata chi_sim
Here is the code:
#include <memory>
#include <string>
#include <stdio.h>
#include <tesseract/capi.h>
#include <leptonica/allheaders.h>

static int ocr(const std::string &image_path, const std::string &tessdata, const std::string &lang)
{
    auto api = std::shared_ptr<TessBaseAPI>(
        TessBaseAPICreate(),
        [](TessBaseAPI *p) { TessBaseAPIDelete(p); }
    );
    if (api->Init(tessdata.c_str(), lang.c_str()))
    {
        fprintf(stderr, "Could not initialize tesseract.\n");
        return -1;
    }

    auto image = std::shared_ptr<Pix>(
        pixRead(image_path.c_str()),
        [](Pix *p) { pixDestroy(&p); }
    );
    // pixRead returns nullptr when the file cannot be read or decoded.
    if (!image)
    {
        fprintf(stderr, "Could not read image %s\n", image_path.c_str());
        return -1;
    }
    api->SetImage(image.get());
    if (api->Recognize(nullptr))
    {
        fprintf(stderr, "Recognize failed\n");
        return -1;
    }

    auto res_it = std::shared_ptr<tesseract::ResultIterator>(api->GetIterator());
    fprintf(stderr, "%4s %4s %4s %4s %4s\n", "c", "l", "t", "r", "b");
    while (!res_it->Empty(tesseract::RIL_TEXTLINE))
    {
        if (res_it->Empty(tesseract::RIL_WORD))
        {
            res_it->Next(tesseract::RIL_WORD);
            continue;
        }

        // Line and word boxes are fetched only for the optional drawing code below.
        int line_bbox[4], word_bbox[4];
        int line_conf, word_conf;
        res_it->BoundingBox(tesseract::RIL_TEXTLINE, &line_bbox[0], &line_bbox[1], &line_bbox[2], &line_bbox[3]);
        res_it->BoundingBox(tesseract::RIL_WORD, &word_bbox[0], &word_bbox[1], &word_bbox[2], &word_bbox[3]);
        line_conf = res_it->Confidence(tesseract::RIL_TEXTLINE);
        word_conf = res_it->Confidence(tesseract::RIL_WORD);

        // auto line_box = std::shared_ptr<Box>(
        //     boxCreate(line_bbox[0], line_bbox[1], line_bbox[2] - line_bbox[0], line_bbox[3] - line_bbox[1]),
        //     [](Box *p){ boxDestroy(&p);}
        // );
        // pixRenderBoxArb(image.get(), line_box.get(), 1, 0xff, 0xff, 0);

        // auto word_box = std::shared_ptr<Box>(
        //     boxCreate(word_bbox[0], word_bbox[1], word_bbox[2] - word_bbox[0], word_bbox[3] - word_bbox[1]),
        //     [](Box *p){ boxDestroy(&p);}
        // );
        // pixRenderBoxArb(image.get(), word_box.get(), 1, 0, 0xff, 0);

        // Walk the symbols of the current word, printing and drawing each box.
        do
        {
            int char_bbox[4];
            res_it->BoundingBox(tesseract::RIL_SYMBOL, &char_bbox[0], &char_bbox[1], &char_bbox[2], &char_bbox[3]);
            // GetUTF8Text returns a new[]-allocated string, so it must be freed with delete[].
            auto text = std::unique_ptr<char[]>(res_it->GetUTF8Text(tesseract::RIL_SYMBOL));
            fprintf(stderr, "%4s %4d %4d %4d %4d\n",
                    text.get(), char_bbox[0], char_bbox[1], char_bbox[2], char_bbox[3]);
            auto box = std::shared_ptr<Box>(
                boxCreate(char_bbox[0], char_bbox[1], char_bbox[2] - char_bbox[0], char_bbox[3] - char_bbox[1]),
                [](Box *p){ boxDestroy(&p);}
            );
            pixRenderBoxArb(image.get(), box.get(), 1, 0, 0, 0xff);
            res_it->Next(tesseract::RIL_SYMBOL);
        } while (!res_it->Empty(tesseract::RIL_BLOCK) && !res_it->IsAtBeginningOf(tesseract::RIL_WORD));
    }

    const auto ocr_box_image_path = image_path + ".ocr_box.png";
    if (pixWrite(ocr_box_image_path.c_str(), image.get(), IFF_PNG))
    {
        fprintf(stderr, "Failed to write ocr box image to %s\n", ocr_box_image_path.c_str());
        return -1;
    }
    return 0;
}

int main(int argc, char **argv)
{
    if (argc != 4)
    {
        fprintf(stderr, "Usage: %s <image_path> <tessdata_path> <lang>\n", argv[0]);
        return 1;
    }
    return ocr(argv[1], argv[2], argv[3]);
}
I have uploaded the original image and the image with the BoundingBox values drawn on it for comparison:
Expected Behavior
I expect BoundingBox to return values with a smaller error.
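To make "smaller error" a bit more concrete, a check along the following lines could flag the boxes I am complaining about. This is only a rough diagnostic sketch, not a fix and not part of Tesseract: SymbolBox, median_height, too_tall and swallows_next are hypothetical names, and the 1.5 ratio is an arbitrary assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper type, not part of Tesseract.
struct SymbolBox {
    int left, top, right, bottom;
};

// Median of the box heights (upper median for an even count); 0 if empty.
inline int median_height(const std::vector<SymbolBox> &boxes)
{
    if (boxes.empty()) return 0;
    std::vector<int> heights;
    heights.reserve(boxes.size());
    for (const SymbolBox &b : boxes) heights.push_back(b.bottom - b.top);
    auto mid = heights.begin() + heights.size() / 2;
    std::nth_element(heights.begin(), mid, heights.end());
    return *mid;
}

// Indices of boxes taller than `ratio` times the median height.
// The default ratio of 1.5 is an assumption chosen for illustration.
inline std::vector<std::size_t> too_tall(const std::vector<SymbolBox> &boxes, double ratio = 1.5)
{
    std::vector<std::size_t> out;
    const double limit = ratio * median_height(boxes);
    for (std::size_t i = 0; i < boxes.size(); ++i)
        if (boxes[i].bottom - boxes[i].top > limit) out.push_back(i);
    return out;
}

// Indices of boxes whose horizontal range fully contains the next box.
inline std::vector<std::size_t> swallows_next(const std::vector<SymbolBox> &boxes)
{
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i + 1 < boxes.size(); ++i)
        if (boxes[i].left <= boxes[i + 1].left && boxes[i + 1].right <= boxes[i].right)
            out.push_back(i);
    return out;
}
```

Fed with the eleven boxes of the first logged line (智 through 觉), the median height is 24, so too_tall flags only 机 (height 41) and swallows_next flags only 算. The expected behavior would be that neither check flags anything for this image.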
Suggested Fix
I have no suggested fix; I am reporting this to ask for help. Thanks!
tesseract -v
tesseract 5.5.0
leptonica-1.85.0
libgif 5.2.2 : libjpeg 8d (libjpeg-turbo 3.0.4) : libpng 1.6.44 : libtiff 4.7.0 : zlib 1.3.1 : libwebp 1.4.0 : libopenjp2 2.5.3
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found OpenMP 201511
Found libarchive 3.7.7 zlib/1.3.1 liblzma/5.6.3 bz2lib/1.0.8 liblz4/1.10.0 libzstd/1.5.6
Found libcurl/8.11.1 OpenSSL/3.4.0 zlib/1.3.1 brotli/1.1.0 zstd/1.5.6 libidn2/2.3.7 libpsl/0.21.5 libssh2/1.11.0 nghttp2/1.64.0 nghttp3/1.6.0
Operating System
No response
Other Operating System
Manjaro Linux x86_64
uname -a
Linux assasin-21d8a009cd 6.11.11-1-MANJARO #1 SMP PREEMPT_DYNAMIC Thu, 05 Dec 2024 16:26:44 +0000 x86_64 GNU/Linux
Compiler
gcc (GCC) 14.2.1 20240910
CPU
CPU: 12th Gen Intel(R) Core(TM) i7-12700H (20) @ 4.70 GHz
Virtualization / Containers
none
Other Information
OS: Manjaro Linux x86_64
Kernel: Linux 6.11.11-1-MANJARO
Shell: zsh 5.9
Display (BOE098E): 1920x1080 @ 60 Hz in 16" [Built-in]
DE: KDE Plasma 6.2.4
WM: KWin (Wayland)
WM Theme: Breeze
Terminal: konsole 24.8.3
Terminal Font: Hack Nerd Font Mono (11pt)
CPU: 12th Gen Intel(R) Core(TM) i7-12700H (20) @ 4.70 GHz
GPU 1: NVIDIA T600 Laptop GPU
GPU 2: Intel Alder Lake-P Integrated Graphics Controller @ 1.40 GHz [Integrated]