I had an interesting case today where I needed to strip all that ugly Office HTML out of an HTML document. I found a tool for Office 2000 that supposedly did just that: strip all that ugly "MSO" stuff from the file. There was a problem, though. The .exe installer wouldn't run on my machine, because I don't have Office 2000 installed, only Office 2003.
I found that this installer supposedly contains a utility called "MSFilter.exe" that runs as a standalone .exe and batch converts HTML files, stripping the Office XML. Some sites mentioned that if you install this tool on a machine with Office 2000, you can just copy the MSFilter.exe file to the computer without Office 2000 and use the utility there.
I don't have such a system, but I figured out a way to extract these files from the installer. I'm not sure how often this will work, but the method I used here worked great to extract just the utility I needed from this installer.
Here's what I did after downloading the MsoHtmf.exe file from Microsoft.
Step 1: Extract CAB files from the .EXE installer file
I opened the .exe file in Visual Studio and peeked at the file's resources. I noticed a binary node called "CABINET." Right-clicking on this node allowed me to export the contents to disk as a "BIN" file. I changed the extension to ".CAB" and was able to open the file with WinZip.
When I did this, I found two files inside the CAB file, "Luncher2.exe" and "msohtmf2.msi". Running this .MSI file gives the same "Cannot Install" error as before, since I didn't have Office 2000 installed. I was sort of back at square one. I needed to open an MSI file and extract the contents.
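If Visual Studio isn't handy, a similar result can sometimes be had by scanning the installer for the cabinet signature instead of browsing its resources. The sketch below is not the method I used above; it assumes the cabinet is stored uncompressed inside the .exe and relies only on the CAB header layout (the "MSCF" signature, a reserved field that must be zero, and a 4-byte total size). The function name carve_cabinets is made up for illustration.

```python
import struct

CAB_MAGIC = b"MSCF"  # every cabinet file starts with these four bytes


def carve_cabinets(data: bytes) -> list[bytes]:
    """Return each embedded cabinet found in data as its own bytes object."""
    cabinets = []
    pos = data.find(CAB_MAGIC)
    while pos != -1:
        header = data[pos:pos + 12]
        if len(header) == 12:
            # Start of CFHEADER: signature, reserved1 (must be 0), cbCabinet (total size)
            _, reserved1, size = struct.unpack("<4sII", header)
            # 36 bytes is the smallest possible cabinet header
            if reserved1 == 0 and 36 <= size <= len(data) - pos:
                cabinets.append(data[pos:pos + size])
                pos = data.find(CAB_MAGIC, pos + size)
                continue
        # Not a plausible header, keep scanning one byte further on
        pos = data.find(CAB_MAGIC, pos + 1)
    return cabinets
```

To use it, read the installer with open("MsoHtmf.exe", "rb").read(), pass the bytes to carve_cabinets, and write each result out with a .CAB extension. A stray "MSCF" string that happens to be followed by plausible-looking numbers would fool it, so treat the output as a candidate and check that it opens.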
Step 2: Extract the CAB contents from the MSI file.
It turns out that there's a little utility deep inside the Windows Installer SDK called "Orca" that will allow you to extract files from MSI installers. I opened Orca.exe and extracted the "Cabs" item in the MSI file by selecting the proper item and choosing "Tables > Export Tables".
This exported the contents of the cab file to disk. In this case the Cabs node contained all the files I needed to run this utility.
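A quick way to confirm that an exported cabinet really holds the files you're after is to read the file names straight out of its header. This is a rough sketch under a few assumptions: the input is a single, complete cabinet, the function name list_cab_files is hypothetical, and names not flagged as UTF-8 are decoded as cp437, which is a guess that only matters for non-ASCII names.

```python
import struct


def list_cab_files(cab: bytes) -> list[str]:
    """Return the file names recorded in a cabinet's CFFILE entries."""
    # CFHEADER is 36 bytes; only the signature, the offset of the first
    # CFFILE entry, and the file count are needed here.
    (signature, _r1, _size, _r2, files_offset, _r3, _minor, _major,
     _folders, file_count, _flags, _set_id, _index) = struct.unpack_from(
        "<4sIIIIIBBHHHHH", cab, 0)
    if signature != b"MSCF":
        raise ValueError("not a cabinet file")

    names = []
    pos = files_offset
    for _ in range(file_count):
        # CFFILE fixed part: size, folder offset, folder index, date, time, attribs
        _fsize, _off, _folder, _date, _time, attribs = struct.unpack_from(
            "<IIHHHH", cab, pos)
        pos += 16
        end = cab.index(b"\x00", pos)  # the name is NUL-terminated
        # Attribute bit 0x80 marks a UTF-8 name; cp437 otherwise is an assumption
        encoding = "utf-8" if attribs & 0x80 else "cp437"
        names.append(cab[pos:end].decode(encoding, errors="replace"))
        pos = end + 1
    return names
```

This only lists names. Actually unpacking the files still needs a tool that understands the cabinet's compression, such as WinZip.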
I extracted these files to disk and could run the batch conversion to remove all that ugly code from my Office HTML documents. I wanted to blog about this because it's sometimes useful to extract just one tool from an entire installer. I don't recommend that you go around hacking installer files, but it's nice to know how to do it when you need to.