NAPS2.Sdk
NAPS2.Sdk is a fully-featured scanning library, supporting WIA, TWAIN, SANE, and ESCL scanners on Windows, Mac, and Linux.
Packages
NAPS2.Sdk is modular, and depending on your needs you may have to reference a different set of packages.
Required Packages
- NAPS2.Sdk
  - Contains core scanning functionality for all platforms.
- Exactly one of:
  - NAPS2.Images.Gdi
    - For working with System.Drawing.Bitmap images. (Windows Forms)
  - NAPS2.Images.Wpf
    - For working with System.Windows.Media.Imaging images. (WPF)
  - NAPS2.Images.Gtk
    - For working with Gdk.Pixbuf images. (Linux)
  - NAPS2.Images.Mac
    - For working with AppKit.NSImage images. (Mac)
  - NAPS2.Images.ImageSharp
    - For working with ImageSharp images.
Optional Packages
- NAPS2.Sdk.Worker.Win32
  - For scanning with TWAIN on Windows.
- NAPS2.Pdfium.Binaries
  - For importing PDFs.
- NAPS2.Sane.Binaries
  - For using SANE drivers on Mac. (Linux has them pre-installed, and Windows isn't supported.)
- NAPS2.Tesseract.Binaries
  - For running OCR. (You can also use a separate Tesseract installation if you like.)
- NAPS2.Escl.Server
  - For sharing scanners across the local network.
Usage
// Set up
using var scanningContext = new ScanningContext(new GdiImageContext());
var controller = new ScanController(scanningContext);

// Query for available scanning devices
var devices = await controller.GetDeviceList();

// Set scanning options
var options = new ScanOptions
{
    Device = devices.First(),
    PaperSource = PaperSource.Feeder,
    PageSize = PageSize.A4,
    Dpi = 300
};

// Scan and save images
int i = 1;
await foreach (var image in controller.Scan(options))
{
    image.Save($"page{i++}.jpg");
}

// Scan and save PDF
var images = await controller.Scan(options).ToListAsync();
var pdfExporter = new PdfExporter(scanningContext);
await pdfExporter.Export("doc.pdf", images);
More samples:
- "Hello World" scanning
- Scan and save to PDF/images
- Scan with TWAIN drivers
- Scan to System.Drawing.Bitmap
- Import and export PDFs
- Export PDFs with OCR
- Store image data on the filesystem
- Share scanners on the local network
Drivers
| | Windows | Mac | Linux |
|---|---|---|---|
| WIA | X | | |
| TWAIN | X | * | |
| Apple | | X | |
| SANE | | X | X |
| ESCL | X | X | X |
WIA (Windows Image Acquisition) is a Microsoft technology for scanners (and cameras). Many scanners provide WIA drivers for Windows.
TWAIN is a cross-platform standard for image acquisition. Many scanners provide TWAIN drivers for Windows and/or Mac.
Apple's ImageCaptureCore provides access to TWAIN and ESCL scanners on Mac devices.
SANE is an open-source API and set of backends for various scanners. It is used primarily on Linux, and supported devices rely on backends written by open-source contributors or by the manufacturers themselves.
ESCL, also known as Apple AirScan, is a standard protocol for scanning over a network. Many modern scanners support ESCL, and as it's a network protocol, specific drivers aren't required. ESCL can also be used over a USB connection in some cases.
Choosing a Driver
Each platform has a default driver (WIA on Windows, Apple on Mac, and SANE on Linux). To use another driver, you only need to specify it when querying for devices:
var devices = await controller.GetDeviceList(Driver.Twain);
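As a rough sketch of how driver selection might be combined with a fallback, the snippet below first asks for ESCL devices (ESCL is listed for all three platforms in the table above) and falls back to the platform default if none are found. The `Driver.Escl` member is assumed by analogy with `Driver.Twain`, and the `NAPS2.Images.Gdi` and `NAPS2.Scan` namespaces are assumptions as well; adjust them to match the packages you reference.

```csharp
using System;
using System.Linq;
using NAPS2.Images.Gdi; // assumed namespace for GdiImageContext
using NAPS2.Scan;       // assumed namespace for ScanningContext, ScanController, Driver

// Set up as in the Usage section (swap in the image context for your platform)
using var scanningContext = new ScanningContext(new GdiImageContext());
var controller = new ScanController(scanningContext);

// Ask for ESCL (network) scanners first; Driver.Escl is an assumed enum member
var devices = await controller.GetDeviceList(Driver.Escl);
if (!devices.Any())
{
    // Nothing found over ESCL, so fall back to the platform's default driver
    devices = await controller.GetDeviceList();
}

Console.WriteLine(devices.Any()
    ? "Found at least one scanner."
    : "No scanners found.");
```

Whether to prefer ESCL or the default driver depends on your hardware; the order here is only an illustration.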
Worker Processes
Using the TWAIN driver on Windows usually requires the calling process to be 32-bit. If you want to use TWAIN from a 64-bit process, NAPS2 provides a 32-bit worker process:
// Reference the NAPS2.Sdk.Worker.Win32 package and call this method
scanningContext.SetUpWin32Worker();
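If the same code may run as either a 32-bit or a 64-bit process, one possible approach is to start the worker only when it is actually needed. This is a minimal sketch that relies on the standard .NET `OperatingSystem.IsWindows()` and `Environment.Is64BitProcess` checks (.NET 5 or later); the namespaces are assumptions, and the guard itself is an illustration rather than something the SDK requires.

```csharp
using System;
using NAPS2.Images.Gdi; // assumed namespace for GdiImageContext
using NAPS2.Scan;       // assumed namespace for ScanningContext, ScanController, Driver

using var scanningContext = new ScanningContext(new GdiImageContext());

// Only a 64-bit process on Windows needs the 32-bit worker for TWAIN
if (OperatingSystem.IsWindows() && Environment.Is64BitProcess)
{
    // Requires a reference to the NAPS2.Sdk.Worker.Win32 package
    scanningContext.SetUpWin32Worker();
}

var controller = new ScanController(scanningContext);
var devices = await controller.GetDeviceList(Driver.Twain);
```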
Contributing
Looking to contribute to NAPS2 or NAPS2.Sdk? Have a look at the wiki.