Auto-scan is a tool to automatically scan documents from HP, OCR files, optimize it and finally send it by email as attachment.
Based on python 3.6 slim buster it supports
-
✓
linux/386
-
✓
linux/amd64
-
✓
linux/arm/v5
-
✓
linux/arm/v7
-
✓
linux/arm64/v8
-
✓
linux/ppc64le
-
✓
linux/s390x
In order to work properly the container provided here needs access to DBus. Actual shortcut is to run it as privileged Docker container and map host socket to gain access to DBus (see launch script below).
Git clone actual repo:
git clone https://github.com/matbgn/auto-scan.git
Git Shallow Clone the scan-pdf repo adapted for HP scanners:
cd auto-scan
git clone --depth=1 https://github.com/matbgn/scan-pdf.git
Build docker image locally or run it in dev mode (see below):
docker build -t matbgn/auto_scan .
source: https://medium.com/@artur.klauser/building-multi-architecture-docker-images-with-buildx-27d80f7e2408
-
Make sure you have a sufficient kernel on your Linux machine >= 4.8
$ uname -r 4.15.0
-
Make sure your docker version is greater than 19.03
$ docker --version Docker version 19.03.5, build 633a0ea83
-
Make sure buildx is activated (if not run
export DOCKER_CLI_EXPERIMENTAL=enabled
)$ docker buildx version github.com/docker/buildx v0.6.3-docker 266c0eac611d64fcc0c72d80206aa364e826758d
-
Use a docker image based installation to have QEMU and binfmt-support up to date
$ docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
ℹ️
|
Pay attention to the fact that you’ll need to re-run that docker image after every system reboot. |
-
Create a buildx builder
$ docker buildx create --name mybuilder mybuilder $ docker buildx use mybuilder
-
You can check your newly created mybuilder with:
$ docker buildx ls NAME/NODE DRIVER/ENDPOINT STATUS PLATFORMS mybuilder * docker-container mybuilder0 unix:///var/run/docker.sock running linux/amd64, linux/arm64, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6 default docker default default running linux/amd64, linux/386, linux/arm64, linux/riscv64, linux/ppc64le, linux/s390x, linux/arm/v7, linux/arm/v6
-
|
If it only reports support for linux/amd64 and linux/386 you either still haven’t met all software requirements, or you had created a builder before you have met the software requirements. In the latter case remove it with docker buildx rm and recreate it. |
-
Finally build it locally in a tar to be loaded later on the target machine
docker buildx build -t matbgn/auto_scan --platform linux/arm64 -o type=docker,dest=- . > auto_scan_arm64.tar
-
To load an image on the target machine, simply run
docker load < auto_scan_arm64.tar
-
Then double-check the list of your pulled images with
docker images
-
-
Printer address in following format:
PRINTER_ADDRESS="hpaio:/net/OfficeJet_Pro_7740_series?ip=192.168.14.113"
Scanning mode supported:
# ADF|Duplex|Flatbed
SCAN_MODE=ADF
Run multiple batches and merge them in one single PDF:
# This argument is optionnal
BATCH_TOTAL=2
Change paper format scanning (A3, A4, A5 supported):
# This argument is optionnal (A4 will be selected as default)
PAPER_FORMAT="A3"
Run script:
sudo docker run -it --rm --net=host --privileged \
-v /var/run/dbus/system_bus_socket:/var/run/dbus/system_bus_socket \
-e SMTPSERVER_EMAIL=your_email@gmail.com \
-e SMTPSERVER_PASS=your_gmail_app_secret \
-e EMAIL_RECIPIENTS="email1@gmail.com;email2@gmail.com" \
-e SCAN_MODE=ADF \
-e SUBJECT="Test scan" \
-e BATCH_TOTAL=1 \
-e PAPER_FORMAT="A4" \
-e PRINTER_ADDRESS="hpaio:/net/OfficeJet_Pro_7740_series?ip=192.168.14.113" \
matbgn/auto_scan
-
Be sure to have those packages installed:
sudo apt-get install -y sane-utils libsane-hpaio \ imagemagick ocrmypdf \ tesseract-ocr-fra tesseract-ocr-deu
-
(Run venv &) Install requirements with:
pip install -r requirements.txt
-
Grab scan_pdf depedency
cd auto-scan git clone --depth=1 https://github.com/matbgn/scan-pdf.git cd ..
-
Put your variables in a .env file with below possibilities:
SMTPSERVER_EMAIL=your_email@gmail.com SMTPSERVER_PASS=your_gmail_app_secret SMTPSERVER_HOST=smtp.gmail.com EMAIL_RECIPIENTS=""email1@gmail.com;email2@gmail.com" SCAN_MODE=Flatbed SUBJECT="Test scan" BATCH_TOTAL=1 PAPER_FORMAT="A4" PRINTER_ADDRESS="hpaio:/net/OfficeJet_Pro_7740_series?ip=192.168.14.113" PAPERLESS_LOCATION="~/paperless/consume"
-
Then run script directly with:
python ./main.py
In case of following error proceed as below:
|
Error during converting jpg to pdf convert-im6.q16: attempt to perform an operation not allowed by the security policy `PDF' @ error/constitute.c/IsCoderAuthorized/408. |
As a temporary fix, edit /etc/ImageMagick-6/policy.xml and change the PDF rights down in the document from none to read|write there. Or simply run:
sed -i 's/rights="none" pattern="PDF"/rights="read|write" pattern="PDF"/' /etc/ImageMagick-6/policy.xml
Not sure about the implications, but at least it allows getting things done.
Follow the first two points of the above development mode on your server (or just locally).
Then simply run (on any port wanted):
waitress-serve --port=4040 app:app &
As a service:
# /etc/systemd/system/auto_scan.service
[Unit]
Description=Auto scan server
After=network.target
[Service]
Type=simple
User=USER_TO_BE_USED
WorkingDirectory=/home/USER_TO_BE_USED/auto-scan
ExecStart=/path/to/venv/bin/waitress-serve --port=4040 app:app
Restart=always
[Install]
WantedBy=multi-user.target