ocrmypdf-auto

Docker app from cmccambridge's Repository

Overview

This container monitors an input directory for PDF documents and automatically runs OCRmyPDF on each file.

It uses inotify to watch the input directory efficiently, and is fairly configurable.

Configuration Details

The Unraid volumes and environment variables listed under Template configuration cover the main configuration options of ocrmypdf-auto. For further details, including how to pass custom command-line parameters to ocrmypdf itself or supply custom tesseract configuration files, see the full README at https://github.com/cmccambridge/ocrmypdf-auto/blob/master/README.md

Runtime arguments

Network
bridge
Privileged
false

Template configuration

Input Directory (Path, rw)

Input directory from which to process files for OCR. Container path: /input

Target
/input
Output Directory (Path, rw)

Output directory to which post-OCR files will be written. Container path: /output

Target
/output
Config Directory (Path, rw)

Config/appdata directory. Container path: /config

Target
/config
Default
/mnt/user/appdata/ocrmypdf-auto
Value
/mnt/user/appdata/ocrmypdf-auto
Output Mode (Variable)

Controls the output directory layout:
MIRROR_TREE - (Default) Mirror the directory structure of the input directory, i.e. for an input file /input/foo/bar.pdf create an output file /output/foo/bar.pdf.
SINGLE_FOLDER - Collect all output files in a single flat folder, i.e. for an input file /input/foo/bar.pdf create an output file /output/bar.pdf.

Target
OCR_OUTPUT_MODE
Default
MIRROR_TREE
Value
MIRROR_TREE
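
The two output modes can be illustrated with a small path-mapping sketch. This is not the container's actual code; the function name output_path and its parameters are hypothetical, and only the /input and /output container paths and the two mode names come from the description above.

```python
from pathlib import PurePosixPath


def output_path(input_file, mode="MIRROR_TREE",
                input_root="/input", output_root="/output"):
    """Return the output path for an input file (hypothetical helper)."""
    src = PurePosixPath(input_file)
    # Raises ValueError if the file is not located under input_root.
    relative = src.relative_to(input_root)
    if mode == "MIRROR_TREE":
        # Keep the sub-directory structure of the input directory.
        return str(PurePosixPath(output_root) / relative)
    if mode == "SINGLE_FOLDER":
        # Drop all sub-directories and keep only the file name.
        return str(PurePosixPath(output_root) / src.name)
    raise ValueError(f"unknown output mode: {mode}")
```

Note that in a flat layout two input files with the same name in different sub-directories map to the same output path, which is worth keeping in mind when choosing SINGLE_FOLDER.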
Action On Success (Variable)

Controls the action (if any) to perform after successful OCR processing:
NOTHING - (Default) Do nothing. Input files remain in place where they were found.
ARCHIVE_INPUT_FILES - Archive input files by moving them (overwriting existing files!) to the /archive volume.
DELETE_INPUT_FILES - Delete the input file after successful processing.

Target
OCR_ACTION_ON_SUCCESS
Default
NOTHING
Value
NOTHING
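
As a rough illustration of the three actions, the sketch below shows one way they could behave. It is an assumption-laden example, not the container's implementation: the function name apply_success_action is hypothetical, and whether the real archive step mirrors the input sub-directories under /archive is assumed here rather than stated in the description.

```python
import shutil
from pathlib import Path


def apply_success_action(input_file, action="NOTHING",
                         input_root="/input", archive_root="/archive"):
    """Apply a post-OCR action to an input file (hypothetical helper)."""
    src = Path(input_file)
    if action == "NOTHING":
        return None  # leave the input file where it was found
    if action == "ARCHIVE_INPUT_FILES":
        # Assumption: the archive mirrors the input directory layout.
        dest = Path(archive_root) / src.relative_to(input_root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        if dest.exists():
            dest.unlink()  # overwrite an existing archived file
        shutil.move(str(src), str(dest))
        return dest
    if action == "DELETE_INPUT_FILES":
        src.unlink()
        return None
    raise ValueError(f"unknown action: {action}")
```

The explicit unlink before the move makes the overwrite behaviour visible, which is the main risk the description warns about for ARCHIVE_INPUT_FILES.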
Additional Languages (Variable)

Additional languages (besides English) to install, given as a space-separated list of language abbreviations. All available languages can be found on the Ubuntu site (https://packages.ubuntu.com/search?keywords=tesseract-ocr-&searchon=names&suite=bionic&section=all). Example for German, Chinese - Simplified, and Italian: deu chi-sim ita

Target
OCR_LANGUAGES
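
To make the format of the value concrete, the following sketch splits a space-separated OCR_LANGUAGES string and maps each abbreviation to an Ubuntu package name. The tesseract-ocr-<abbreviation> naming follows the package search linked above; the function name tesseract_packages is hypothetical, and how the container actually installs the languages is not described here.

```python
def tesseract_packages(ocr_languages):
    """Map a space-separated language list to Ubuntu package names.

    Hypothetical helper: assumes packages are named
    tesseract-ocr-<abbreviation>.
    """
    # str.split() without arguments tolerates repeated or leading spaces.
    return [f"tesseract-ocr-{abbr}" for abbr in ocr_languages.split()]


if __name__ == "__main__":
    print(tesseract_packages("deu chi-sim ita"))
```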
Notify URL (Variable)

On successful completion, a POST request is made to the given URL with a JSON payload of {'pdf': '/output/doc.pdf', 'txt': '/output/doc.pdf.txt'}. The txt property is present only if you add the --sidecar option to the ocr.config file. This can be used to kick off additional processing, such as indexing the content or sending notifications.

Target
OCR_NOTIFY_URL
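
A receiver for these notifications might look like the minimal sketch below, using only the Python standard library. It assumes the request body is valid JSON with the pdf and optional txt keys shown above; the names parse_notification, NotifyHandler, and serve, the port, and the response codes are all illustrative choices, not part of ocrmypdf-auto.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_notification(body):
    """Return (pdf_path, txt_path_or_None) from a notification body."""
    payload = json.loads(body)
    # 'txt' is only sent when the --sidecar option is configured.
    return payload["pdf"], payload.get("txt")


class NotifyHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that logs each completed OCR job."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            pdf, txt = parse_notification(self.rfile.read(length))
        except (ValueError, KeyError, TypeError):
            # Malformed JSON or missing 'pdf' key.
            self.send_response(400)
            self.end_headers()
            return
        print(f"OCR finished: pdf={pdf} txt={txt}")
        self.send_response(204)
        self.end_headers()


def serve(host="0.0.0.0", port=8000):
    """Run the receiver until interrupted (port is an arbitrary choice)."""
    HTTPServer((host, port), NotifyHandler).serve_forever()
```

With such a receiver running on a host reachable from the container, OCR_NOTIFY_URL would be set to that host's address and port, and the print call could be replaced by an indexing or notification step.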
Process Existing on Startup (Variable)

Set to 1 to enable processing of any files in the input directory when the container is launched.
Set to 0 (Default) or leave unset to ignore existing files until they are modified.

Target
OCR_PROCESS_EXISTING_ON_START
Default
0
Value
0
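
The flag semantics described above (1 enables, 0 or unset disables) can be sketched as follows. The function name process_existing_on_start is hypothetical, and treating any value other than 1 as disabled is an assumption, since the description only covers 1, 0, and unset.

```python
import os


def process_existing_on_start(environ=None):
    """Return True only when OCR_PROCESS_EXISTING_ON_START is set to 1."""
    env = os.environ if environ is None else environ
    # Unset defaults to "0"; other values are treated as disabled (assumption).
    return env.get("OCR_PROCESS_EXISTING_ON_START", "0").strip() == "1"
```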
Verbosity (Variable)

Controls the verbosity of debug logging. Accepts Python logging levels, e.g. warn (Default), info, debug, etc.

Target
OCR_VERBOSITY
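
For reference, the level names map onto the constants of Python's standard logging module. The sketch below shows one way to resolve a name such as warn to its numeric level; the helper name resolve_level is hypothetical, and the container may resolve the value differently.

```python
import logging


def resolve_level(name="warn"):
    """Translate a level name (case-insensitive) to a logging constant."""
    level = getattr(logging, name.strip().upper(), None)
    if not isinstance(level, int):
        raise ValueError(f"unknown logging level: {name}")
    return level


if __name__ == "__main__":
    # Lower numeric levels are more verbose: DEBUG < INFO < WARNING.
    logging.basicConfig(level=resolve_level("info"))
    logging.getLogger("example").info("logging configured")
```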
UID Override (Variable)

Set the UID that the OCR tools will run as. The Unraid standard is 99.

Target
USERMAP_UID
Default
99
Value
99
GID Override (Variable)

Set the primary GID that the OCR tools will run with. The Unraid standard is 100.

Target
USERMAP_GID
Default
100
Value
100

Download Statistics

Total Downloads: 336,026
This Month: 660
Avg / Month: 1,436

Details

Repository
cmccambridge/ocrmypdf-auto:latest
Last Updated: 2021-02-04
First Seen: 2020-03-17

ocrmypdf-auto is listed in Community Apps for Unraid OS.