Section: Tech

16 Aug

Reading and writing Microsoft Word docx files with Python

I've been wanting to script simple text scanning and substitution in Microsoft Word documents for a while now, and after a little digging, it turns out to be fairly straightforward to read and edit .docx files (OpenXML, originally standardized as ECMA-376 and now under ISO as ISO/IEC 29500). Although I …


20 Apr

Python auto sort of OCR'ed PDFs

I'd previously written about how I was using a Fujitsu ScanSnap 1500 to reduce paper clutter and move to a paperless workflow at home.  So far, this system has been working great for me, with every scanned document getting OCR'ed and uploaded to my default Evernote notebook as a searchable …


01 Mar

Better VIM for Python

As someone who spends a large fraction of their day editing text and code, I've often thought about just investing a few days in learning the more advanced time-saving features of my text editor, VIM. Unfortunately, "a few days" just doesn't happen, and the few times I did learn some new tips …


02 Nov

Class-based decorators in Python

I recently started using decorators in Python (2.7) to clean up some existing code, and one big hurdle I had to surmount was the dearth of accurate information on using class-based decorators. The few examples I found were quite buggy, and it seemed that most people did not use …


25 Oct

Getting rid of paper clutter

Note

A lot of this information is out of date now. Please see my PyPDFOCR package for how I do everything described in this article.

I've tried to go paperless for a long time now, but the overhead of starting my laptop, scanning, importing, and then filing in a folder always made …

© Virantha Ekanayake. Built using Pelican. Modified svbhack theme, based on theme by Carey Metcalfe