Note
A lot of this information is outdated now. Please see my PyPDFOCR package for how I currently do everything described in this article.
I've tried to go paperless for a long time now, but the overhead of starting my laptop, scanning, importing, and then filing in a folder always made me go back to just keeping stacks of paper around. But now I've finally found a solution that works well enough that I haven't abandoned it after a few months. The key is a stand-alone scanner that does OCR, converts to PDF, and then uploads to Evernote automatically. Details below:
Hardware required
- Fujitsu ScanSnap 1500 - This was the game-changer; it's a little pricey but after having gone through a bunch of inferior flat-bed and Neatworks scanners, this stand-alone scanner really makes things simple.
- Dedicated PC running on your network - You could just use your laptop, but I already have several desktops running as file servers etc., which makes things simpler.
Software required
- Evernote - I'm a big fan of this service, and use it for all my notes/receipts/documents/lists. Get the premium service, it's definitely worth it!
- ABBYY FineReader for OCR (this ships with the Fujitsu ScanSnap)
- A program that can watch a folder for file changes and run a script. Since the server I'm using at home runs Windows, I use Watch 4 Folder, but this should be much easier on Mac OS X.
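If you don't have a folder-watching tool handy, the idea can be sketched in a few lines of Python. This is only a minimal polling sketch and not what Watch 4 Folder does internally; the names `new_files` and `watch`, the `.pdf` suffix default, and the 10-second interval are all assumptions made for illustration.

```python
import os
import time


def new_files(folder, seen, suffix=".pdf"):
    """Return full paths of files in `folder` ending in `suffix`
    that have not been reported before. `seen` is updated in place."""
    found = []
    for name in sorted(os.listdir(folder)):
        if name.endswith(suffix) and name not in seen:
            seen.add(name)
            found.append(os.path.join(folder, name))
    return found


def watch(folder, handler, interval=10.0):
    """Poll `folder` forever and call handler(path) for each new file.
    The handler stands in for "run a script" (hypothetical callback)."""
    seen = set()
    while True:
        for path in new_files(folder, seen):
            handler(path)
        time.sleep(interval)
```

Polling is crude compared to real file-system notifications, but it needs nothing beyond the standard library and behaves the same on Windows and Mac OS X.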
Workflow Scripts
- Set up a profile on your ScanSnap scanner that scans to ABBYY FineReader (the OCR software that comes with the scanner), produces a searchable PDF, and writes the file to a specific folder, which I'll refer to as "Incoming". The raw scan will first show up as "YYYY_MM_DD_HH_MM_SS.pdf", and once ABBYY finishes the OCR in the background, it will replace it with "YYYY_MM_DD_HH_MM_SS_OCR.pdf".
- Set up the following batch file, "move.bat". It is run for each changed file in the "Incoming" folder; if the file ends with "_OCR.pdf", it waits 5 minutes and then moves it to a folder called "To evernote":
[ccw lang="dos" width="100%" strict="true"]
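rem Assumption: %1 is the quoted path of the changed file, passed in by start.bat
set file=%1
rem Strip the leading quote and the trailing _OCR.pdf" (9 characters)
set noext=%file:~1,-9%
set ocr="%noext%_OCR.pdf"
IF EXIST %ocr% (
echo "Found %ocr%! Waiting 5 minutes before doing anything"
rem Sleep about 300 seconds by pinging localhost; the old 1.1.1.1 trick no longer waits because that address now answers
PING 127.0.0.1 -n 301 >NUL
move %ocr% "c:\users\virantha\Documents\ScanSnap\To evernote"
)
exit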
[/cc]
- Since we want multiple instances of this batch file running (one per scanned file, each with its own 5-minute wait), we need to create another batch file, "start.bat", to launch it as a separate process:
- [cc lang="dos"] start c:\Users\virantha\Documents\move.bat %1 [/cc]
- Configure Watch 4 Folder to run at startup minimized, and monitor your "Incoming" folder for any changes, and then execute "start.bat"
- Configure Evernote "Tools -> Import Folders" and add your "To evernote" directory to the list of folders Evernote watches for file imports.
- You're done! One press on your scanner, and your OCR'ed PDF documents will arrive in your default Evernote notebook in a few minutes. You can then use Evernote search to find whatever you need at a later date, even if you don't bother to file these new notes.
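For anyone not on Windows, the filtering and delayed move that "move.bat" performs can be sketched in Python. This is a rough equivalent under stated assumptions, not the script I actually run: the function names `is_ocr_output` and `move_when_settled` are hypothetical, and the 300-second default simply mirrors the 5-minute wait described above.

```python
import os
import shutil
import time

OCR_SUFFIX = "_OCR.pdf"  # naming convention of the finished OCR file


def is_ocr_output(path):
    """True if the file name looks like a finished OCR scan."""
    return os.path.basename(path).endswith(OCR_SUFFIX)


def move_when_settled(path, dest_dir, delay_seconds=300):
    """Wait `delay_seconds`, then move an OCR'ed PDF into `dest_dir`.

    Returns the new path, or None if `path` is not an existing
    *_OCR.pdf file (raw scans are ignored)."""
    if not is_ocr_output(path) or not os.path.exists(path):
        return None
    time.sleep(delay_seconds)  # give the OCR software time to finish writing
    os.makedirs(dest_dir, exist_ok=True)
    target = os.path.join(dest_dir, os.path.basename(path))
    return shutil.move(path, target)
```

Calling `move_when_settled(changed_file, to_evernote_dir)` for every changed file in "Incoming" (both arguments being placeholders for your own paths) leaves raw scans alone and hands only the finished "_OCR.pdf" files to the folder Evernote imports from.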