Commit Graph

466 Commits

Author SHA1 Message Date
Ben Olden-Cooligan
ec70a7c017 Make more NAPS2.Images classes internal 2024-04-07 17:09:22 -07:00
Ben Olden-Cooligan
3490a5234a Partially fix Pdfium PDF/A compliance and improve tests 2024-04-06 13:57:32 -07:00
Ben Olden-Cooligan
9a2537c840 Use Polyfill package to simplify net462 support 2024-04-01 18:09:47 -07:00
Ben Olden-Cooligan
1efc0eeef5 Escl: Use xunit IAsyncLifetime for start/stop in test fixture 2024-04-01 10:16:57 -07:00
Ben Olden-Cooligan
94120c441f Fix mac tests 2024-04-01 10:14:50 -07:00
Ben Olden-Cooligan
e547d35be1 Remove ImageContext from IMemoryImage constructors
Now that ImageContext is stateless it can be created on demand, simplifying a lot of things.
2024-04-01 00:25:25 -07:00
Ben Olden-Cooligan
6456d1f1e7 Escl: Add timeouts to tests 2024-03-31 22:48:34 -07:00
Ben Olden-Cooligan
39dfb1c2b9 Disable net8 tests on Windows
Not much benefit and it slows down testing.
2024-03-31 22:12:26 -07:00
Ben Olden-Cooligan
7ef70c3f0a Escl: Support HTTPS by default by generating self-signed certs 2024-03-31 21:52:06 -07:00
Ben Olden-Cooligan
e6318000b8 Add PageNumber and PageSide to PostProcessingData 2024-03-31 20:20:27 -07:00
Ben Olden-Cooligan
46577c6a44 Fix fake OcrResult creation 2024-03-27 23:33:18 -07:00
Ben Olden-Cooligan
c07bcddf3b Escl: HTTPS support, security policies, and HTTPS->HTTP fallback
#338
2024-03-27 23:25:25 -07:00
Ben Olden-Cooligan
79bba70370 Improve OCR text alignment
This is nearly a full rewrite of the alignment code. Position is now based on the line baseline (provided by Tesseract) and the font size is smarter (defaulting to Tesseract's provided value with various adjustments).

The goals were:
- Have Ctrl+F highlight the word as accurately as possible.
- Have Ctrl+A/Ctrl+C end up with text that matches the original as closely as possible.
- Have PdfSharp and Pdfium produce consistent output.
On my test cases, all goals are fully met.

#236
2024-03-26 19:05:17 -07:00
Ben Olden-Cooligan
431a894fed Upgrade packages 2024-03-19 17:01:32 -07:00
Ben Olden-Cooligan
fc58f48c91 Simplify twain type names 2024-03-17 17:51:01 -07:00
Ben Olden-Cooligan
bda067fe6f Use NAPS2.NTwain package 2024-03-09 19:50:06 -08:00
Ben Olden-Cooligan
3669c2df56 Use NAPS2.PdfSharp package 2024-03-09 19:47:05 -08:00
Ben Olden-Cooligan
22f48b27e0 Implement grayscale bilateral filter 2024-03-03 10:31:31 -08:00
Ben Olden-Cooligan
b7140b8b09 Fix warnings and comments 2024-03-01 10:43:21 -08:00
Ben Olden-Cooligan
05b56cdc0a Upgrade NAPS2.Tesseract.Binaries to 1.2.0
Tesseract 5.2.0 -> 5.3.4
2024-02-17 20:33:56 -08:00
Ben Olden-Cooligan
8906f5b76e Provide IPdfRenderer from ProcessedImage 2024-02-06 20:22:48 -08:00
Ben Olden-Cooligan
fe5dd21a64 Disable missing font test case on Windows 2024-02-04 19:44:45 -08:00
Ben Olden-Cooligan
e50e4ca8e6 Fix alignment of test ocr text 2024-02-03 17:09:12 -08:00
Ben Olden-Cooligan
2adbf79e8c Normalize whitespace when comparing PDF text 2024-01-30 23:03:24 -08:00
Ben Olden-Cooligan
efaa397db0 Tweak threshold for Pdfium font embedding file size test 2024-01-30 21:08:17 -08:00
Ben Olden-Cooligan
a5ec457e52 Upgrade NAPS2.Pdfium.Binaries to 1.1.0 2024-01-30 18:31:36 -08:00
Ben Olden-Cooligan
7863336f01 Use font subsets for Pdfium OCR exporting 2024-01-21 12:54:53 -08:00
Ben Olden-Cooligan
42e4cb7873 Add a Syriac test case 2024-01-21 10:19:05 -08:00
Ben Olden-Cooligan
1a921a0244 Comprehensive language->script->font mappings 2024-01-20 21:45:19 -08:00
Ben Olden-Cooligan
8a04a966c3 Pick PDF fonts for Linux 2024-01-20 19:13:12 -08:00
Ben Olden-Cooligan
1c91c624f2 Pick PDF font based on OCR language 2024-01-20 18:32:55 -08:00
Ben Olden-Cooligan
c9be81b375 Add (failing) tests for PDF font handling 2024-01-19 21:56:30 -08:00
Ben Olden-Cooligan
488429ea8d Escl: Change device ID to be just the UUID 2024-01-12 19:25:41 -08:00
Ben Olden-Cooligan
f4f782a4e9 Fix tests on non-windows 2024-01-11 19:26:07 -08:00
Ben Olden-Cooligan
5d7c72e223 Console: Add --rotate option
#252
2024-01-10 20:38:15 -08:00
Ben Olden-Cooligan
ea1bd7e33f Parse well-known page sizes 2024-01-10 20:07:21 -08:00
Ben Olden-Cooligan
fbb71c7b65 Parse page sizes without spaces 2024-01-10 19:39:33 -08:00
Ben Olden-Cooligan
26cbaabf90 Make ImageExportHelper static 2024-01-09 21:02:33 -08:00
Ben Olden-Cooligan
85fa64427a Remove BitDepth from ImageMetadata
This is obsolete now that we have LogicalPixelFormat.
2024-01-03 16:08:11 -08:00
Ben Olden-Cooligan
37032983f5 Rename ImagePixelFormat.Unsupported to Unknown 2024-01-03 15:55:02 -08:00
Ben Olden-Cooligan
0afea36ba5 Rename ImageFileFormat.Unspecified to Unknown 2024-01-03 15:53:38 -08:00
Ben Olden-Cooligan
0b93ca2a15 Calculate LogicalPixelFormat lazily 2024-01-03 15:52:19 -08:00
Ben Olden-Cooligan
8b230f5323 Improve ScanDriverException hierarchy
Specific exception types for different errors (e.g. paper jam) will allow SDK users to handle specific cases according to their business logic.
2023-12-30 12:21:44 -08:00
Ben Olden-Cooligan
8db66c0706 Split SharedDevice into internal and config-level types 2023-12-30 10:58:38 -08:00
Ben Olden-Cooligan
dd83057b48 Sdk: Make OCR sdk-friendly 2023-12-29 14:15:58 -08:00
Ben Olden-Cooligan
f66d30bbbd Remove tempFolder from TesseractOcrEngine constructor 2023-12-29 13:30:01 -08:00
Ben Olden-Cooligan
8e21493326 Wpf: Fix remaining tests 2023-12-27 21:36:54 -08:00
Ben Olden-Cooligan
ee4db52303 Unify test target frameworks for Windows 2023-12-25 07:46:15 -08:00
Ben Olden-Cooligan
0ba58617ce Escl: Quality fixes 2023-12-13 21:46:45 -08:00
Ben Olden-Cooligan
9092870ee7 Escl: Job cleanup and minor fixes 2023-12-12 20:55:09 -08:00