Squeaky Clean 0.10 Alpha
Squeaky Clean was written with HTML exported from Office in mind. It rips out all the classes, styles, strange XML and conditionals. The result won't look the same afterwards, but at least the markup is nice and clean.
This makes it easy to go back in and reimplement the styles using sensible CSS. Alternatively, you can edit the config file 'Clean.xml' to stop style and class attributes being removed.
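As a rough illustration of the kind of cleanup described above, here is a minimal Python sketch that removes class and style attributes from well-formed markup. It is not Squeaky Clean's own code: the function name strip_attributes, the STRIP_ATTRS set and the sample markup are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Attributes to drop. This is an assumed list for illustration,
# not the tool's actual defaults.
STRIP_ATTRS = {"class", "style"}

def strip_attributes(markup: str, attrs=STRIP_ATTRS) -> str:
    """Parse well-formed markup and remove the named attributes everywhere."""
    root = ET.fromstring(markup)
    for element in root.iter():  # iter() visits the root and all descendants
        for name in list(element.attrib):
            if name in attrs:
                del element.attrib[name]
    return ET.tostring(root, encoding="unicode")

# Hypothetical Office-style input: class and style go, id stays.
sample = '<p class="MsoNormal" style="margin:0cm"><b id="x">Hello</b></p>'
cleaned = strip_attributes(sample)
print(cleaned)
```

Taking an attribute out of STRIP_ATTRS in this sketch corresponds to the idea of editing the configuration so that style or class attributes are kept.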
Documents will be converted to UTF-8 from whatever charset they started in. Installing iconv extends the charset support to include multi-byte charsets, such as East Asian and Arabic charsets. By default, most single-byte charsets and Unicode are supported.
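The conversion step can be pictured as decode-then-encode. The sketch below uses Python's built-in codecs rather than iconv, and the helper name to_utf8 is hypothetical; it only shows the general idea, not how the program does it.

```python
def to_utf8(raw: bytes, source_charset: str) -> bytes:
    """Decode bytes from the source charset, then re-encode them as UTF-8."""
    text = raw.decode(source_charset)
    return text.encode("utf-8")

# "caf" followed by e-acute, stored as a single byte in ISO-8859-1.
latin1_bytes = b"caf\xe9"
utf8_bytes = to_utf8(latin1_bytes, "iso-8859-1")
print(utf8_bytes)
```

In UTF-8 the same accented character takes two bytes, which is why the output is one byte longer than the input.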
This program uses an XML parser to read the HTML. This means that if the source file is far from XML-compliant, it will fail to parse. I have no interest in writing a robust HTML parser, so you'll either have to fix the file or use some other tool. The parser is not too strict about quotes and the like. You can even tell it not to look for child tags by adding tags to the "nochild" section of the config file 'Clean.xml'.
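To see why non-compliant HTML trips up an XML parser, consider this small Python check. It uses Python's standard xml.etree.ElementTree, which is stricter than the parser described above (it has no equivalent of the "nochild" setting), so treat it only as an illustration of the failure mode; the helper name parses_as_xml is an assumption.

```python
import xml.etree.ElementTree as ET

def parses_as_xml(markup: str) -> bool:
    """Return True if the markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# A self-closed <br/> is fine for an XML parser.
well_formed = parses_as_xml("<p>one<br/>two</p>")
# A bare <br> never gets closed, so </p> arrives as a mismatched tag.
tag_soup = parses_as_xml("<p>one<br>two</p>")
print(well_formed, tag_soup)
```

A pre-flight check like this can tell you whether a file needs fixing by hand before it is handed to an XML-based cleaner.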
The attributes and elements that get deleted are configurable via the file 'Clean.xml' distributed with the app. It works for the one file I needed cleaned, but I expect it'll need work to be useful for the general case. Please read the comments in that file and edit it if necessary for your own files. Generally useful changes should be sent back to me for inclusion in future versions.
Future versions may dig into the CSS and clean it up instead of just deleting it all. But that would require more invasive parsing, and anyway this is just an alpha release.

Download Squeaky Clean 0.10 Alpha
Similar software
Squeaky Clean 0.10 Alpha
Matthew Allen
Squeaky Clean was written with HTML exported from Office in mind.
Squeaky Clean 1.0.1
Kiwi Enterprises
Squeaky Clean application locks your mouse pointer away allowing you to clean your mouse or trackball.
MenuXP 1.0
ComponentSpot
CMenuXP is a small set of MFC classes that allow developers to add various graphical user interface elements with an Office XP look to their MFC apps.
Strangers in Strange Lands 1.0
San Diego Screen Savers
Strangers in Strange Lands is a beautiful and colorful screensaver that contains many interesting images.
Mind Pad 3.1
AKS-Labs
Mind Pad is a .
Link Ripper 7.01
A novel concept
Link Ripper is a simple, yet efficient, and powerful software application that can organize your hyperlinks.
Office Diary 2006 3.25
NewEasy Technology
Office Diary is software that allows you to add, edit and manage your diary.
Word to HTML converter 1.6.1
Maluke Software
Word to HTML converter will easily convert MS Word documents to HTML in just two clicks.
Denizens of Deep Space 1.0
Pixel Paradox
See denizens of 32 of the inhabited worlds in the outer regions of the galaxy.
OxMath Classes 1.0
Chaotic Box
OxMath Classes - Standalone 3D math classes (vector, quaternion, etc.).