added frame count section (#1745)

albertoandreottiATgmail · web-flow · commit 55438ae0b6ad · 2025-03-21T08:30:36.000+01:00
diff --git a/docs/en/ocr_benchmark.md b/docs/en/ocr_benchmark.md
@@ -96,7 +96,18 @@ Note: don't forget to count multi-frame images.
 If you don't know the dimensions, but you know that your images are also uncompressed, you can use file size as proxy. 
 _Keep in mind that if you actually have compressed images you will be underestimating the processing time_.
 
-
+#### Why the numbers are related to uncompressed images
+Effective compression ratios vary so much across real datasets that compressed file size is impractical as a basis for estimation. Datasets with low information density can reach compression ratios of up to 30x, while others compress less, so picking a representative reference dataset is very difficult in the general case. </br>
+But not all hope is lost: you can work with frames instead.
+
+
+#### Using total frame count as a proxy for estimation
+If you don't know the image sizes or the compression level, you can estimate the total number of frames and multiply it by the time it takes to process a single frame (such as the numbers shared in this document).
+</br>
+Reasons for doing this:
+* You can compute frame count and sizes very efficiently using Visual NLP.
+* Even if the frames have different dimensions, you can resize them before feeding them to the ML models in the pipeline.
+* This way, each frame will have a fixed size and, as a result, a predictable processing time.
 
 
 </div>
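
As a rough illustration of the frame-count approach this change documents, the estimate comes down to multiplying the total number of frames by the per-frame processing time. The sketch below is plain Python and does not use Visual NLP. The function name `estimate_processing_seconds`, the `workers` parameter, and all numbers are hypothetical, and dividing by `workers` assumes ideal linear scaling, which real clusters rarely achieve.

```python
def estimate_processing_seconds(frame_counts, seconds_per_frame, workers=1):
    """Estimate total processing time from per-file frame counts.

    frame_counts: iterable with the number of frames in each file
                  (a single-page image counts as 1, a multi-frame file as N).
    seconds_per_frame: measured time for one frame at the fixed, resized size.
    workers: hypothetical number of parallel workers (ideal scaling assumed).
    """
    if seconds_per_frame <= 0:
        raise ValueError("seconds_per_frame must be positive")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    total_frames = sum(frame_counts)
    return total_frames * seconds_per_frame / workers


# Hypothetical dataset: four files with 1, 3, 12 and 1 frames each.
frame_counts = [1, 3, 12, 1]
estimate = estimate_processing_seconds(frame_counts, seconds_per_frame=2.0, workers=4)
print(f"{sum(frame_counts)} frames, estimated {estimate:.1f} s")
```

In practice, `seconds_per_frame` would be taken from a benchmark run at the same frame size used after resizing, so that the fixed-time assumption holds.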