Last edit: 05-12-13 Graham Wideman
HTMLTagClean: Fixing Office's Baroque HTML
Article created: 2005-03-04
If you've ever wanted to paste data from Excel to an HTML page, say in FrontPage, you've confronted the problem that Excel's HTML is heavily laden with gratuitous formatting.
In theory there's an argument that this is so that the HTML can be read back into Excel with enough detail to recreate the original... but as often as not it leaves you tearing your hair out trying to get rid of all that ^&%$# formatting. And FrontPage's "Remove Formatting" is not much help here, because at least some of the formatting takes the form of styles and attributes that this function ignores.
It looks simple in Excel:
... but once copied and pasted into FrontPage, the HTML is hideous:
<table x:str border="0" cellpadding="0" cellspacing="0"
width="192" style="border-collapse: collapse;width:144pt" id="table2">
.... and that's just the first couple of rows! It's faster to just retype it from scratch. And what Excel saves as an HTML file is similar.
In a few quick steps, HTMLTagClean helps get rid of all attributes from selected tags, and can remove other tags completely:
|Launch HTMLTagClean||... if it's not already running. (In Windows Explorer, double-click on HTMLTagClean.exe, or use a shortcut.)|
|Copy the HTML from the HTML editor.||
Select and copy the chunk of HTML (source) that you want to fix. Hint: If you've pasted an Excel
table into FrontPage, then select it while in Design view (because that's easy), then flip to
Code view, then copy it.
...or for a quick test, just hit the (Example) button in HTMLTagClean.
|Paste into HTMLTagClean||
Press the Clear button if there's existing text. Press the Paste button:
|Clean attribs from tags||
... or ...
a) In the "2. Remove Attribs from Tags" area, enter a list of tags into the slot next to the
"From these tags:" button.
b) Decide whether you want to remove or retain particular attribs, and set the Retain/Remove radio button and attribs slot accordingly.
c) Hit the "From these tags" button.
|Remove tags themselves||
a) In the "3. Remove Tags" area, enter the tags to be removed.
b) Hit the "Remove these tags" button.
... aaahhhh! What a relief!
|Select, Copy, Paste||Use the Select All...Copy button to copy the cleaned HTML to the clipboard. Then paste into FrontPage in place of the original HTML.|
|Format||Now you can format the minimal HTML the way you want to.|
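For readers who want to script the same kind of cleanup, the "remove attribs from tags" and "remove tags" steps above can be approximated in a few lines of Python. This is only a rough sketch built on the standard library's html.parser, not HTMLTagClean's actual code. The names TagCleaner and clean_html are made up for illustration. It assumes that Retain mode keeps only the listed attribs (so an empty list strips them all), that Remove mode drops only the listed ones, and that removing a tag discards the tag markers but keeps the text inside them.

```python
from html import escape
from html.parser import HTMLParser


class TagCleaner(HTMLParser):
    """Hypothetical helper: re-emits HTML with selected attribs and tags removed."""

    def __init__(self, clean_tags, attribs=(), retain=True, remove_tags=()):
        # convert_charrefs=False so entities like &nbsp; pass through untouched.
        super().__init__(convert_charrefs=False)
        self.clean_tags = {t.lower() for t in clean_tags}
        self.attribs = {a.lower() for a in attribs}
        self.retain = retain  # True: keep only listed attribs; False: drop listed
        self.remove_tags = {t.lower() for t in remove_tags}
        self.out = []

    def _keep_attr(self, name):
        listed = name in self.attribs
        return listed if self.retain else not listed

    def _render(self, tag, attrs, closer=">"):
        # Filter attributes only on the tags the user asked to clean.
        if tag in self.clean_tags:
            attrs = [(n, v) for n, v in attrs if self._keep_attr(n)]
        parts = [tag]
        for name, value in attrs:
            # Valueless attributes (such as Excel's x:str) have value None.
            parts.append(name if value is None else f'{name}="{escape(value)}"')
        return "<" + " ".join(parts) + closer

    def handle_starttag(self, tag, attrs):
        if tag not in self.remove_tags:
            self.out.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        if tag not in self.remove_tags:
            self.out.append(self._render(tag, attrs, " />"))

    def handle_endtag(self, tag):
        if tag not in self.remove_tags:
            self.out.append(f"</{tag}>")

    # Everything that is not a tag is copied through unchanged.
    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def clean_html(html_text, clean_tags, attribs=(), retain=True, remove_tags=()):
    """Return html_text with attribs filtered on clean_tags and remove_tags stripped."""
    cleaner = TagCleaner(clean_tags, attribs, retain, remove_tags)
    cleaner.feed(html_text)
    cleaner.close()
    return "".join(cleaner.out)


dirty = (
    '<table x:str border="0" width="192" style="width:144pt">'
    '<tr><td class="xl24"><font face="Arial">Apples</font></td></tr></table>'
)
# Keep only border on table/tr/td, and strip <font> tags entirely.
print(clean_html(dirty, ["table", "tr", "td"], ["border"],
                 retain=True, remove_tags=["font"]))
# prints: <table border="0"><tr><td>Apples</td></tr></table>
```

Tag and attribute names come back lowercased from the parser, and attribute values are re-quoted with double quotes, so the output is normalized, not a byte-for-byte copy of the input. Office-specific constructs such as conditional comments are not handled in this sketch.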
|HTMLTagClean_111.zip||1.1.1||2005-12-03||Added Retain/Remove feature. (Thanks to Lee Tagg for the
Added Example button
Revised HTML buffer to avoid truncating really long lines (over 1000 characters)
|HTMLTagClean_101.zip||1.0.1||2005-08-27||Minor adjustments to layout and memo behavior|
|Download||.... from this page|
Unzip using WinXP Windows Explorer or WinZip. Copy the contents to some convenient folder, perhaps:
|Shortcuts||Optionally create a shortcut on the desktop or in the Start menu in the usual way, by Alt-dragging the executable file from Windows Explorer.|