Brother, why art thou?

My New Year's theme is "Organization". I'm a notoriously messy person (this blog is a fine example).

I also have a bad habit of losing paper and being entirely unable to find it when I need it. Further, when I DO find it, I rarely have the time to act on the thing I actually need to do with it.

So, I'm scanning every single piece of important mail and putting it in my document management system. That way I can find it when I need it, even months later, while not having to be colocated with the actual piece of paper.

This started with three main pieces of required technology.

1) A document scanner

2) A document management system

3) Somewhere to put the actual documents.

Because I'm disorganized, and too lazy to reorganize the above, we'll be going out of order.

2) Document Management System

I'm using Paperless-NGX. I like that it integrates with LDAP and Authelia, and that it has a decent web interface that translates well to mobile.

I add tags, correspondents, and document types as necessary, and it learns which common types of documents I've got coming in. Eventually, I'll get more software in there to pull out things like due dates and add them to my calendar, along with direct links to the documents.

3) Somewhere to put the actual documents

Let's put it on the NAS! It's easy, it's there, it's backed up, and it's got a huge amount of space. Create a new share, point Paperless to it, and point the document scanner to it. Sprinkle with permissions, and roast at 350 until done.

1) A document scanner.

For this, I upgraded my trusty Brother MFC-9330CDW to the MFC-9340CDW. The only upgrade there is that the 9340CDW has a duplex Automatic Document Feeder (ADF). Meaning, I put in a stack of paper, and it scans all the pages, both sides.

All seems well, except for one thing. The shortcuts on the front of the scanner don't remember that you want to scan both sides of the paper every single time. Six "button" presses on a resistive touch screen for each document is not how I want to spend my life.

A solution of solutions

I could collate with a separator page and make Paperless-NGX do the document splitting, but I want less contact with my papers, not more. And then I'd have to store the separator pages, or print extras as needed. Nah, too much effort.

The only official way to scan both sides of the paper from the ADF semi-automatically is using the Brother software (Windows/Mac only).

Fortunately, either their software is dumb or their scanner is standards compliant. Whichever it is, the scanner is supported by scanimage (https://linux.die.net/man/1/scanimage).
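For anyone checking whether their own scanner is in the same boat, here is a rough sketch of the two probes that matter. `scanimage -L` lists the devices SANE can see, and `--help -d <device>` prints the options that device supports (including its `--source` values). The `SCANIMAGE` override and the example device name `brother4:net1;dev0` are assumptions for illustration, not details from this setup.

```shell
#!/bin/sh
# Binary to run. Hypothetical knob: point it at a stub for dry runs.
SCANIMAGE=${SCANIMAGE:-scanimage}

# List every device SANE can currently see.
list_scanners() {
  "$SCANIMAGE" -L
}

# Show the options one device supports, e.g. its --source values.
show_scanner_options() {
  "$SCANIMAGE" --help -d "$1"
}

# Example (the device name is a placeholder; use one printed by list_scanners):
#   list_scanners
#   show_scanner_options 'brother4:net1;dev0'
```

If a duplex ADF source shows up in the option list, the front-panel limitation can probably be sidestepped from the command line.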

Now that I know I can scan things from the command line, I need a way of doing so automatically. Enter: https://github.com/PhilippMundhenk/BrotherScannerDocker.

Since I've already got a substantial docker cluster floating around, I tend to dockerize whatever I can to make my life easier to manage. This image is a fancy wrapper around scanimage and provides a REST API.

The scanning scripts do need some tweaks to get everything working properly. Anyone familiar with bash scripting should be able to make heads or tails of them, and I've detailed my script below.

Since the goal is to improve SAF and reduce the time spent in front of the keyboard, I need a way of triggering the REST API.

HomeAssistant is my home automation "controller" of choice. It's the ONE app I allow to have access to everything, and the one app the family can be guaranteed to work with the smart stuff. Fortunately, even in its dockerized form, it can call a REST API.

Now, to bridge the gap between HomeAssistant and meatspace, we leverage my zigbee2mqtt container to pick up a Zigbee remote that came with a bunch of Zigbee bulbs I bought from Home Depot at a fire sale many years ago. The remote (Leedarson 4-Key Remote Controller) has been supported for some time now, and I use them ALL over my house for various things.

So the button on the remote gets a new label from my label maker. HomeAssistant gets a set of automations, a new set of services, and we're off to the races.
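When a button press doesn't seem to reach HomeAssistant, it can help to watch the broker directly. The sketch below assumes the `mosquitto_sub` client is installed, that zigbee2mqtt uses its default `zigbee2mqtt/<friendly name>` topic layout, and that the remote's JSON payload carries an `action` field. The broker host and friendly name are placeholders.

```shell
#!/bin/sh
# Pull the value of the "action" field out of each JSON payload line.
# Lines without an "action" field produce no output.
extract_action() {
  sed -n 's/.*"action"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Print one line per button press seen on the broker.
# $1 = MQTT broker host, $2 = zigbee2mqtt friendly name of the remote
watch_remote() {
  mosquitto_sub -h "$1" -t "zigbee2mqtt/$2" | extract_action
}

# Example (both arguments are hypothetical):
#   watch_remote mqtt.example.lan scanner_remote
```

If presses show up here but the automation never fires, the problem is on the HomeAssistant side rather than the Zigbee side.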

The Final Rube Goldberg

1) I insert paper into the ADF.

2) I press the appropriate button on the remote.

3) The remote sends that signal to zigbee2mqtt.

4) zigbee2mqtt sends it to my mqtt broker.

5) HomeAssistant picks up the message from the broker.

6) HomeAssistant kicks off the automation associated with that button.

7) The automation calls the service associated with that REST endpoint.

8) The BrotherScannerDocker is commanded to Scan Both Sides.

9) The scan is then dropped into my network share folder.

10) Paperless-NGX automatically scans that folder every few minutes (because the container doesn't have inotify).

11) The scan is OCR'd, reoriented, and (hopefully) appropriately tagged.

12) I can see the paper on my phone or computer at a time and place of my choosing.
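Step 10 is plain polling: with no inotify events, the consumer has to re-list the folder on a timer and work out what is new. Paperless-NGX documents a `PAPERLESS_CONSUMER_POLLING` setting (an interval in seconds) for this case; check the docs for your version. The sketch below is not how Paperless-NGX does it internally, just a minimal illustration of the idea. The `poll_once` name and the "seen" file are made up for the example.

```shell
#!/bin/sh
# Print PDFs in a folder that have not been reported before, and remember them.
# $1 = folder to check, $2 = text file tracking already-seen paths
poll_once() {
  dir=$1
  seen=$2
  touch "$seen"
  for f in "$dir"/*.pdf; do
    [ -e "$f" ] || continue            # glob matched nothing
    if ! grep -Fxq "$f" "$seen"; then  # exact, whole-line match
      printf '%s\n' "$f"
      printf '%s\n' "$f" >> "$seen"
    fi
  done
}

# Example polling loop (paths and interval are placeholders):
#   while true; do poll_once /scans /tmp/seen.txt; sleep 300; done
```

The trade-off is latency: a scan can sit for up to one full interval before it gets picked up.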

Why?

Because Brother won't let us save scanning preferences on the on-screen shortcuts. That one change would eliminate steps 1-8.

Brother, I'm begging you. Please make these firmware changes. I guarantee I'm not the only one with these issues.

No, Why the scanning thing?

Because I'm wildly disorganized in meatspace. I've tried to get better at it, but it's ultimately something that doesn't get done, and I end up with an inevitable Large Pile of papers placed precariously. This doesn't solve the Large Pile problem, it just makes it so I can find what I need when I finally have a chance to call That Office which is only open 9:30am-4pm with a 12:15pm-1:45pm lunch break.

PS. Have some documentation:

HomeAssistant Automation YAML

alias: Printer Remote Scan Option 1
description: ""
trigger:
  - platform: device
    domain: mqtt
    device_id: $DEVICE_ID
    type: action
    subtype: "on"
    discovery_id: $DISCOVERY_ID action_on
  - platform: device
    domain: mqtt
    device_id: $DEVICE_ID
    type: action
    subtype: "off"
    discovery_id: $DISCOVERY_ID action_off
condition: []
action:
  - service: rest_command.option1_scan
    metadata: {}
    data: {}
  - service: persistent_notification.create
    metadata: {}
    data:
      message: Option 1 Scan Button Pressed
mode: single

(Prefer configuring this through the UI; the flow is much easier there.)

HomeAssistant REST-API Service

rest_command:
  option1_scan:
    url: "http://$BrotherScannerDocker_IP:$BrotherScannerDocker_PORT/scan.php?target=$TARGET" # One of "email", "file", "ocr", "image"
    method: get

In my case the _IP is my docker cluster, and the _PORT is what I've mapped the container's port to.
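Before wiring the endpoint into HomeAssistant, it may be worth poking it by hand. This is a minimal sketch that builds the same URL shape as the `rest_command` above and calls it with `curl`. The helper names and the example address are assumptions; the `scan.php?target=` path and the four target values come from the config above.

```shell
#!/bin/sh
# Build the scan URL, rejecting targets the endpoint is not expected to accept.
# $1 = host, $2 = port, $3 = target (email, file, ocr, or image)
scan_url() {
  case "$3" in
    email|file|ocr|image) ;;
    *) echo "unknown target: $3" >&2; return 1 ;;
  esac
  printf 'http://%s:%s/scan.php?target=%s\n' "$1" "$2" "$3"
}

# Fire a scan; fails if the URL is invalid or the server returns an HTTP error.
trigger_scan() {
  url=$(scan_url "$@") || return 1
  curl --fail --silent --show-error "$url"
}

# Example (address and port are placeholders for your own mapping):
#   trigger_scan 192.0.2.10 33355 file
```

If the scanner starts pulling paper when this runs, the HomeAssistant side is only a matter of pasting the same URL into the config.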

BrotherScannerDocker docker-compose.yaml excerpt

  brother-scanner:
    image: ghcr.io/philippmundhenk/brotherscannerdocker
    volumes:
       - $NAS/RAW/:/scans
       - $NAS/brotherscannerdocker/script/:/opt/brother/scanner/brscan-skey/script/
    ports: 
       - 54925:54925/udp
       - 54921:54921/udp
       - 33355:33355
       - 161:161/udp
    environment: 
       - NAME=Scanner
       - MODEL=MFC-9340CDW
       - IPADDRESS=$PRINTER_IP_ADDRESS
       - TZ=$TZ
       - HOST_IPADDRESS=$DOCKER_HOST_IP_ADDRESS
       - WEBSERVER=true
       - PORT=33355
       - RENAME_GUI_SCANTOFILE="Option 1"
       - RENAME_GUI_SCANTOEMAIL="Option 2"
       - RENAME_GUI_SCANTOIMAGE="Option 3"
       - RENAME_GUI_SCANTOOCR="Option 4"
       - UID=1000
       - GID=1000

You'll need to figure out your own storage paths here. Make sure there's plenty of space and that Paperless-NGX can reach it.

BrotherScannerDocker Scan Script (all 4 options of mine are the same, just dump the file in different folders)

#!/bin/bash
# $1 = scanner device
# $2 = friendly name

#if [[ $RESOLUTION ]]; then
#  resolution=$RESOLUTION
#else
 resolution=300
#fi

if [ "$USE_JPEG_COMPRESSION" = "true" ]; then
    compression_flag="-compress JPEG -quality 80"
else
    compression_flag=""
fi

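device=$1
cd /scans || exit 1
# %M is minutes; %m is the month and would repeat it in the time portion
date=$(date +"%Y-%m-%d-%H_%M_%S")
# OPTION_1 is expected to come from the environment; it names both the
# file prefix and the target subfolder under /scans
filename_base=$OPTION_1-$date
output_file=$filename_base.pdf

# Scan both sides of every page in the ADF into a single PDF
scanimage -l 0 -t 0 -x 215 -y 297 --source='Automatic Document Feeder(centrally aligned,Duplex)' --format=pdf --resolution="$resolution" --batch="/scans/$OPTION_1/$output_file"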
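The filename's timestamp is easy to get subtly wrong, since `date` uses `%m` for the month and `%M` for the minute. Here is a small sketch for checking a format string against a known instant, assuming GNU `date` (for `-d @epoch`). The `scan_filename` helper and the `Option1` prefix are made up for illustration.

```shell
#!/bin/sh
# Build a scan filename from a prefix and a Unix timestamp (UTC).
# $1 = filename prefix, $2 = seconds since the epoch
scan_filename() {
  stamp=$(date -u -d "@$2" +"%Y-%m-%d-%H_%M_%S") || return 1
  printf '%s-%s.pdf\n' "$1" "$stamp"
}

# Example: 1700000000 is 2023-11-14 22:13:20 UTC, so month (11) and
# minute (13) differ and a %m/%M mix-up is visible at a glance.
#   scan_filename Option1 1700000000
#   prints Option1-2023-11-14-22_13_20.pdf
```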