naps2/NAPS2.Sdk/Ocr
Ben Olden-Cooligan 79bba70370 Improve OCR text alignment
This is nearly a full rewrite of the alignment code. Position is now based on the line baseline (provided by Tesseract) and the font size is smarter (defaulting to Tesseract's provided value with various adjustments).

The goals were:
- Have Ctrl+F highlight the word as accurately as possible.
- Have Ctrl+A/Ctrl+C end up with text that matches the original as closely as possible.
- Have PdfSharp and Pdfium produce consistent output.
On my test cases, all goals are fully met.

#236
2024-03-26 19:05:17 -07:00
IOcrEngine.cs Add xml doc and remove unnecessary types 2023-12-30 10:35:09 -08:00
OcrController.cs Make OcrController internal 2023-12-30 12:30:24 -08:00
OcrErrorEventArgs.cs Use Microsoft.Extensions.Logging for SDK logging 2023-04-06 19:25:08 -07:00
OcrEventArgs.cs Add OcrOperationManager and more OcrRequestQueue tests 2022-06-13 21:31:55 -07:00
OcrMode.cs Add a separate checkbox for OCR preprocessing 2024-02-25 11:55:08 -08:00
OcrParams.cs Sdk: Make OCR sdk-friendly 2023-12-29 14:15:58 -08:00
OcrPriority.cs Finish OcrRequestQueue tests 2022-06-17 00:14:07 -07:00
OcrRequest.cs Use Microsoft.Extensions.Logging for SDK logging 2023-04-06 19:25:08 -07:00
OcrRequestParams.cs OCR after scanning WIP 2022-07-24 18:08:50 -07:00
OcrRequestQueue.cs Tune OCR worker count 2024-01-05 10:22:06 -08:00
OcrRequestState.cs Finish OcrRequestQueue tests 2022-06-17 00:14:07 -07:00
OcrResult.cs Improve OCR text alignment 2024-03-26 19:05:17 -07:00
OcrResultElement.cs Improve OCR text alignment 2024-03-26 19:05:17 -07:00
StubOcrEngine.cs More API cleanup 2023-08-05 09:05:23 -07:00
TesseractOcrEngine.cs Improve OCR text alignment 2024-03-26 19:05:17 -07:00