Most of us, when we start dabbling with Excel macros, probably take the following steps:
- Record code
- Begin to make minor edits to the code
- Eventually learn about loops and start using them everywhere
- Learn that loops are "expensive" and cost time to do so start looking for fast ways
- Discover "screenupdating=false" to cut loop execution time a bit
- Start using Excel's built in functions to avoid loops and cut time significantly
This is pretty much the way my leaning went anyway. I now try to avoid loops wherever possible, and try to increase execution speed as I just hate waiting for my stuff to complete.
I work with a lot of ASCII files dumped out of our Property Management System, often pulling them into Excel to work with. One example of this is our Trial Balance. Because of the way the vendor built the software, it creates a Trial Balance with every account repeated for every department, even if our chart of accounts doesn't have that combination of account/department codes. This means that when I print it to file, my trial balance is over 15,700 lines long.
I actually use this report in a few different ways, and sometimes can need to update it several times in one day. Naturally I wrote a macro to go through the file, strip out garbage lines, reformat the entry so it can be used as I need it and a couple of other things. Like a total neophyte, I opened it in Excel, then proceeded to loop through every line. The end result, based on a single timer test, was 132.5 seconds to complete the process. 2.2 minutes of staring at the screen waiting for it to complete each time I ran it.
At the MVP summit, I had a discussion with Damon Longworth about doing this kind of task using Autofilters, but he recommended streaming it in from the text file and checking the data before it ever hit Excel. I decided to give it a try.
Despite re-writing to do the task (which I had to do anyway due to a change), I actually needed more case statements and if statements to work outside Excel than I did if I pulled the data in. Regardless, the execution time cut down from over 2 minutes to 25.5 seconds. Pretty darn cool. 🙂
For reference, I started with (what I believe was) Bernie Dietrick's large file import code, and modified it to do my stuff. His code, which I originally worked with is shown below:
Sub LargeFileImport(Filename As String)
'Bernie Deitrick's code for opening very large text files in Excel
Dim ResultStr As String
Dim FileNum As Integer
Dim Counter As Double
'Get Next Available File Handle Number
FileNum = FreeFile()
'Open Text File For Input
Open Filename For Input As #FileNum
'Turn Screen Updating Off
Application.ScreenUpdating = False
'Create A New WorkBook With One Worksheet In It
'Set The Counter to 1
Counter = 1
'Loop Until the End Of File Is Reached
Do While Seek(FileNum) <= LOF(FileNum) 'Display Importing Row Number On Status Bar Application.StatusBar = "Importing Row " & Counter & " of text file " & Filename 'Store One Line Of Text From File To Variable Line Input #FileNum, ResultStr 'Store Variable Data Into Active Cell If Left(ResultStr, 1) = "=" Then ActiveCell.Value = "'" & ResultStr Else ActiveCell.Value = ResultStr End If If ActiveCell.Row = Activesheet.Rows.Count Then 'If On The Last Row Then Add A New Sheet ActiveWorkbook.Sheets.Add Else 'If Not The Last Row Then Go One Cell Down ActiveCell.Offset(1, 0).Select End If 'Increment the Counter By 1 Counter = Counter + 1 'Start Again At Top Of 'Do While' Statement Loop 'Close The Open Text File Close End Sub[/vb]
2 thoughts on “Stripping data from an imported file”
You really need to get more into regular expressions 🙂 I take a 100,000 line account dump from our GL system, parse the thing with regexp, then summarize in excel and split into relevant text files in about 10 seconds. My monthly 450,000+ line data dump does take about 25 seconds to do, but there are too many calculations in there to speed it up anymore. You know how to reach me if you want to know more 🙂
You're right, Matt. I should get into regExp. As soon as I have the time to re-write again, I'll call you up. 🙂