Wednesday, 12 July, 2006

ActionScript 3 Regular Expressions

I've been playing around with the new Flash 9, learning a bit about programming in ActionScript 3.  This new version of ActionScript is a real programming language with a real object model and the other things we've come to expect from environments such as Delphi, Java, and .NET.  I have only a couple of days' experience with it, so I don't have a lot to report yet, but we did run into an interesting oddity when splitting a string using regular expressions.

The String class has a method, split(), which divides a string into an array of substrings wherever the specified delimiter occurs.  The delimiter parameter is usually a string or a regular expression.  So, these two statements should give the same results:

var words:Array = str.split("\r\n");    // delimiter parameter is a string
var words:Array = str.split(/\r\n/);    // delimiter parameter is a regular expression

We use the above code to split a loaded text file (loaded into a single string) into an array of lines.  The two versions do in fact give the same results, but the regular expression version takes much, much longer.  With a file that contains 400 words, both statements return immediately.  With a file of 10,000 lines, the string version is still instantaneous, but the regular expression version takes about five seconds.  With 180,000 lines, the string version is still immediate, but we gave up on the regular expression version after more than five minutes.
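
For anyone who wants to try the comparison, here is a minimal sketch of how the two calls could be timed with getTimer().  The class name SplitTiming, the helper makeLines(), and the synthetic "word0", "word1", ... input are my own stand-ins for illustration; our real test used a loaded text file, so your numbers will differ.

```actionscript
package {
    import flash.display.Sprite;
    import flash.utils.getTimer;

    // Hypothetical document class; the name and the synthetic input are assumptions.
    public class SplitTiming extends Sprite {
        public function SplitTiming() {
            // Stand-in for the loaded file: 10,000 short lines separated by CR LF.
            var text:String = makeLines(10000);

            // Time the version whose delimiter is a string.
            var start:int = getTimer();
            var byString:Array = text.split("\r\n");
            var stringMs:int = getTimer() - start;

            // Time the version whose delimiter is a regular expression.
            start = getTimer();
            var byRegExp:Array = text.split(/\r\n/);
            var regExpMs:int = getTimer() - start;

            trace("string delimiter: " + stringMs + " ms, " + byString.length + " items");
            trace("RegExp delimiter: " + regExpMs + " ms, " + byRegExp.length + " items");
        }

        // Builds lineCount short lines joined by CR LF.
        private function makeLines(lineCount:int):String {
            var parts:Array = [];
            for (var i:int = 0; i < lineCount; i++) {
                parts.push("word" + i);
            }
            return parts.join("\r\n");
        }
    }
}
```

Both arrays should report the same item count; only the elapsed times are expected to differ.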

A .NET program that loads a file and splits the lines using regular expressions is instantaneous, so I don't think that I'm expecting too much by wanting the Flash version to perform similarly.

I poked around with the regular expression options for a while and got nowhere.  What concerns me isn't so much the speed, but that the time required does not increase linearly with the number of lines.  If it did, 180,000 lines (18 times as many as 10,000) would take about 90 seconds rather than more than five minutes.  Somebody at Adobe needs to take a look at their regular expression matching algorithm.
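
One way to check the growth rate is to double the input size a few times and compare each timing with the previous one.  The sketch below does that for the regular expression version only.  The class name SplitScaling, the helper makeLines(), and the line counts are assumptions of mine, chosen small enough that the run should stay well inside Flash's script time limit.  If the time grew linearly, each step would take roughly twice as long as the one before it.

```actionscript
package {
    import flash.display.Sprite;
    import flash.utils.getTimer;

    // Hypothetical document class; the name, sizes, and synthetic input are assumptions.
    public class SplitScaling extends Sprite {
        public function SplitScaling() {
            var previousMs:int = 0;

            // Double the number of lines on each pass: 1000, 2000, 4000, 8000.
            for (var lineCount:int = 1000; lineCount <= 8000; lineCount *= 2) {
                var text:String = makeLines(lineCount);

                var start:int = getTimer();
                text.split(/\r\n/);
                var elapsedMs:int = getTimer() - start;

                // Ratio to the previous (half-size) run; about 2.0 would indicate linear growth.
                var ratio:String = previousMs > 0
                    ? (elapsedMs / previousMs).toFixed(1)
                    : "n/a";
                trace(lineCount + " lines: " + elapsedMs + " ms (ratio to previous size: " + ratio + ")");

                previousMs = elapsedMs;
            }
        }

        // Builds lineCount short lines joined by CR LF.
        private function makeLines(lineCount:int):String {
            var parts:Array = [];
            for (var i:int = 0; i < lineCount; i++) {
                parts.push("word" + i);
            }
            return parts.join("\r\n");
        }
    }
}
```

Timings from getTimer() have millisecond resolution, so the smallest sizes may be too quick to give a meaningful ratio.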