It’s an ancient (not to say obsolete) format, so you’d expect some…
- mht-rip-0.8.c: 2010, in C.
- mhtconv: libmht-0.1, 2009, in C.
- spackager-0.5.5 (GitHub): 2012-04, and in Python!
But, anyway, I ended up writing my own.
- $ git init
- $ git add mhtifier.py README.md
- $ git commit -m "Created repo, committing code and initial doc."
- $ git remote add origin https://github.com/Modified/MHTifier.git
- $ git push
(Actually, procedure was much uglier, all this non-fast-forward annoyance…)
Possible enhancements and likely bugs that I’ve no time to fix:
- The cleanest design would’ve been to read stdin and write stdout, but that turned out inconvenient, annoying even, so I added command-line options instead.
- Performance of Python’s stdlib email module (premature optimization?):
email.message_from_bytes(mht.read()) # Buffers the whole file in memory before parsing. The docs describe the feed parser (BytesFeedParser), rather than Parser, as "conducive to incremental parsing of email messages, such as would be necessary when reading the text of an email message from a source that can block", so I guess it would be more efficient to feed it from stdin in chunks, rather than buffering.
- Encodings (ASCII, UTF-8) and de/coding were painful, and are probably still buggy.
- Base64-encoded binaries: my editor, Geany, chokes on them, I think when it wraps these long lines?
- Verify index.html is present.
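For what it's worth, here is a minimal sketch of the incremental-parsing idea from the list above, plus a crude stand-in for the index.html check. It is not the code in mhtifier.py: the function names (parse_mht_incrementally, list_parts, has_html_part), the chunk size and the tiny sample archive are all made up for illustration, and I'm assuming that "index.html is present" roughly means "the archive contains at least one text/html part". Only the stdlib calls (BytesFeedParser, walk, get_payload) are real.

```python
import io
from email.feedparser import BytesFeedParser


def parse_mht_incrementally(stream, chunk_size=64 * 1024):
    """Feed a binary stream to the parser in chunks and return the root message."""
    parser = BytesFeedParser()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # b"" means end of stream
            break
        parser.feed(chunk)
    return parser.close()


def list_parts(message):
    """Return (content type, Content-Location, decoded bytes) for each leaf part."""
    parts = []
    for part in message.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        parts.append((
            part.get_content_type(),
            part.get("Content-Location"),
            part.get_payload(decode=True),  # undoes base64 / quoted-printable
        ))
    return parts


def has_html_part(message):
    """Hypothetical check: is there a text/html part that could become index.html?"""
    return any(ctype == "text/html" for ctype, _, _ in list_parts(message))


# Made-up two-part archive, only for illustration.
SAMPLE = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/related; boundary="XYZ"\r\n'
    b"\r\n"
    b"--XYZ\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Location: http://example.com/index.html\r\n"
    b"\r\n"
    b"<html><body>hi</body></html>\r\n"
    b"--XYZ\r\n"
    b"Content-Type: image/png\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"Content-Location: http://example.com/a.png\r\n"
    b"\r\n"
    b"iVBORw0KGgo=\r\n"
    b"--XYZ--\r\n"
)

# A deliberately small chunk size, to show that lines split across chunks are fine.
message = parse_mht_incrementally(io.BytesIO(SAMPLE), chunk_size=16)
parts = list_parts(message)
for ctype, location, payload in parts:
    print(ctype, location, len(payload))
print("has HTML part:", has_html_part(message))
```

To read from stdin instead of a BytesIO, pass sys.stdin.buffer as the stream. Whether this is actually faster than message_from_bytes on real archives I haven't measured; it mainly avoids holding the raw file and the parsed message in memory at the same time.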