Read Apache log files line by line: fileStream.readLine()

18 January 2011

I wanted to read access log files with an Adobe AIR application. As these files can get very big, loading an entire file into memory is not an option. Unfortunately, FileStream does not help much because it has no readLine() method like Java’s BufferedReader.readLine().

Googling reveals a nice implementation by Sandeep Gupta: FileStreamWithLineReader. While it works fine for pure ASCII files, it breaks on files with multibyte characters like öäü. Also, I’m no fan of extending classes and prefer encapsulating the FileStream. My FileStreamLineReader below is slower than FileStreamWithLineReader, but it works for all characters.
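To see why multibyte characters are tricky for a line reader, here is a minimal sketch. It only shows that the character count and the byte count of a UTF-8 string differ; that this mismatch is what trips up the other implementation is my assumption, I have not verified it. The snippet can be pasted into any method.

```actionscript
import flash.utils.ByteArray;

var text:String = "öäü"
var bytes:ByteArray = new ByteArray()
// writeUTFBytes encodes the string as UTF-8 without a length prefix
bytes.writeUTFBytes(text)
trace(text.length)  // 3 characters
trace(bytes.length) // 6 bytes, each umlaut takes two bytes in UTF-8
```

A reader that advances the file position by the number of characters instead of the number of bytes would therefore drift by one byte per umlaut.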

Here’s an example of how to use the class:

var accessLog:File = new File("access.log")
var logStream:FileStreamLineReader = new FileStreamLineReader()
logStream.open(accessLog, FileMode.READ)
while(logStream.lineAvailable) {
	var log:String = logStream.readUTFLine()
}
logStream.close()

package 
{
import flash.filesystem.File;
import flash.filesystem.FileStream;
import flash.utils.ByteArray;


public class FileStreamLineReader {
	
	private var _buffer:ByteArray
	private var _bufferSize:uint
	private var _fileStream:FileStream
	// Number instead of int: FileStream.position is a Number and log files can exceed 2 GB
	private var _lastLineEndPosition:Number = 0
	private var _lineAvailable:Boolean = true
	
	public function FileStreamLineReader(bufferSize:uint=512) {
		_bufferSize = bufferSize
	}
	
	public function open(file:File, fileMode:String):void {
		_buffer = new ByteArray()
		_fileStream = new FileStream()
		_fileStream.open(file, fileMode)
		// reset the state so the reader can be reused for another file
		_lastLineEndPosition = 0
		_lineAvailable = true
	}
		
	public function get lineAvailable():Boolean {
		return _lineAvailable
	}
	
	public function readUTFLine():String {
		return readMultiByteLine("utf-8")
	}
	
	public function readMultiByteLine(charSet:String):String {
		var toReturn:String = readLine(charSet)
		
		// Windows line endings are 13 followed by 10 (\r\n). readLine() splits at 10,
		// so a trailing 13 is stripped here. Unix line endings (10 only) pass unchanged.
		if(toReturn != null && toReturn.charCodeAt(toReturn.length - 1) == 13) {
			return toReturn.substr(0, toReturn.length - 1)
		}
		
		return toReturn
	}
	
	private function readLine(charSet:String):String {
		const initialReadPosition:Number = _lastLineEndPosition
		_fileStream.position = initialReadPosition
		var adaptedBufferSize:uint = _bufferSize
		while (true) {
			var bytesToRead:uint = Math.min(adaptedBufferSize, _fileStream.bytesAvailable)
			_fileStream.readBytes(_buffer, 0, bytesToRead)
			var currentReadString:String = _buffer.readMultiByte(bytesToRead, charSet)
			var index:int = currentReadString.indexOf('\n')
			if(index != -1) {
				// keep everything before '\n'
				currentReadString = currentReadString.substr(0, index)
				// re-encode the line to get its length in bytes, not in characters
				_buffer.clear()
				_buffer.writeMultiByte(currentReadString, charSet)
				// + 1 skips the '\n' itself; assumes a charset that encodes '\n'
				// as a single byte (e.g. utf-8 or iso-8859-1)
				_lastLineEndPosition += _buffer.length + 1
				_buffer.clear()
				return currentReadString
			} else {
				_buffer.clear()
				if(_fileStream.bytesAvailable == 0) {
					// end of file: return the rest as the last line
					_lineAvailable = false
					return currentReadString
				} else {
					// no line end in the buffer yet: start over with a bigger buffer
					_fileStream.position = initialReadPosition
					adaptedBufferSize *= 2
				}
			}
		}
		throw new Error("could not find line")
	}
	
	public function close():void {
		_fileStream.close()
	}

}
}


Mac vs PC

22 June 2010

This is just too good to miss on a blog called “I don’t like computers.”:


Which Flash Fonts you have.

18 February 2010

I was struggling last week with the font “Tahoma” because italic was not shown properly. It turned out that Tahoma cannot display italic type. So what were my alternatives?

And here we go, the list of all fonts on your computer. If you see an error, you have run into a Flash Player 10.0 bug. Just update to Flash Player 10.1.
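For reference, a list like this can be produced with Font.enumerateFonts(). The following is a minimal sketch and not the code of the app above; the class name ListFonts is made up for illustration.

```actionscript
package {
import flash.display.Sprite;
import flash.text.Font;

// Hypothetical document class that traces all fonts Flash Player can see.
public class ListFonts extends Sprite {

	public function ListFonts() {
		// true = include the device fonts installed on the computer,
		// not only the fonts embedded in the SWF
		var fonts:Array = Font.enumerateFonts(true)
		fonts.sortOn("fontName", Array.CASEINSENSITIVE)
		for each (var font:Font in fonts) {
			trace(font.fontName + " (" + font.fontStyle + ", " + font.fontType + ")")
		}
	}
}
}
```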


ActionScript/Flex Dependency Injection performance

1 February 2010

I’m currently sneaking around the dependency injection (DI) frameworks Dawn, Swiz and Parsley. None of which I have ever used in production, so take my judgment with care. At first look, the use of Dawn is simplest and most concise and somewhat similar to Google’s Guice which I like. In Swiz, I don’t like the global state used in event handling. Parsley is well documented but a framework that is configurable by XML makes me very suspicious.

Tests

DI relies on class reflection, which in Flash is mainly done with describeType(). I had in mind that it was a very slow function, but I didn’t know for sure. So I wrote some code to estimate the runtime performance of DI frameworks:

Find metadata tag [Inject] in different classes:
5050 μs for UIComponent
5150 μs for PersonDetails
700 μs for PersonDetails with DescribeTypeCache
30 μs for SimplestClass
360 μs for describeType XML of UIComponent

2 μs to dispatch and handle an event

0.2 μs to call getQualifiedClassName(UIComponent)
1.6 μs to call getDefinitionByName(UIComponent)

0.3 μs to call uiComponent['drawFocus']
0.2 μs to call uiComponent.hasOwnProperty('maxHeight')
0.7 μs to call uiComponent['maxHeight']

9 μs to call Log.getLogger('mx.core.UIComponent')
12 μs to call CreateLogger.withClassNameOf(VBox)

μs = microseconds
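The measurements boil down to timing many calls with getTimer() and dividing by the number of runs. Here is a minimal sketch of that idea. It is not the code of the test app: the class name, the number of runs and the use of Sprite (to avoid a Flex dependency) are my choices for illustration, so the numbers will differ from the ones above.

```actionscript
package {
import flash.display.Sprite;
import flash.utils.describeType;
import flash.utils.getTimer;

// Hypothetical benchmark: how long does it take to look for [Inject] via describeType()?
public class DescribeTypeBenchmark extends Sprite {

	public function DescribeTypeBenchmark() {
		const runs:int = 1000
		// getTimer() only has millisecond resolution, hence the many runs
		const start:int = getTimer()
		for (var i:int = 0; i < runs; i++) {
			hasInjectTag(Sprite)
		}
		const elapsedMs:int = getTimer() - start
		trace((elapsedMs * 1000 / runs) + " μs per lookup")
	}

	private function hasInjectTag(type:Class):Boolean {
		var description:XML = describeType(type)
		// attribute("name") instead of @name: it does not throw
		// if a metadata node has no name attribute
		return description..metadata.(attribute("name") == "Inject").length() > 0
	}
}
}
```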

You can evaluate the Flash Player performance yourself with the app. The code is available with a right click on the app.

Results

describeType() is really slow. It takes more than 5 ms on UIComponent. What surprised me positively is the speed of getQualifiedClassName(). I have since switched all logger creation to getQualifiedClassName() with the helper class CreateLogger.as.
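A helper along these lines could look as follows. This is a sketch under assumptions, not the actual CreateLogger.as: only the method name withClassNameOf is taken from the measurements above, the body is my guess at a straightforward implementation. It needs the Flex SDK for mx.logging.

```actionscript
package {
import flash.utils.getQualifiedClassName;
import mx.logging.ILogger;
import mx.logging.Log;

// Hypothetical sketch of a logger factory based on getQualifiedClassName().
public class CreateLogger {

	public static function withClassNameOf(type:Class):ILogger {
		// getQualifiedClassName(VBox) returns "mx.containers::VBox".
		// As far as I know, Log.getLogger() rejects categories containing ':',
		// so the "::" separator is replaced by a dot.
		var category:String = getQualifiedClassName(type).replace("::", ".")
		return Log.getLogger(category)
	}
}
}
```

A class would then create its logger once, e.g. private static const LOG:ILogger = CreateLogger.withClassNameOf(VBox), without ever touching describeType().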

Conclusion

Dependency injection frameworks only make sense in big projects. Let’s assume a project with 200 different classes that have injected objects: at roughly 5 ms of reflection per class, the start-up time will increase by 1 second. Knowing that Flex applications don’t fire up quickly anyway, 1 second is surely too long. For BlocStac, I will use DI only in side projects and hope that Flash Player offers faster reflection functions soon.


Flex component EditImage rewritten

24 January 2010

Past releases of the free EditImage component have seen some interest. So I thought I could also share the new version, which is written from scratch. EditImage suits the needs of the BlocStac wiki, but maybe you can tweak it for your own purpose. Just to make it clear, EditImage is for Flex developers.

The current features are:

  • scale
  • crop
  • rotate
  • undo/redo
  • zooming (not scaling) and panning the image
  • zoom the image to maximum allowed width
  • 3 states: zoomable, editable and only readable
  • import/export

Demo

As before, the source code is under the liberal BSD license and you can find it on Google Code. As we use this code internally, expect changes in the source. However, note that we will not tag any release in Mercurial. EditImage requires Flex 4 and Flash Player 10.

Have fun using it,

Marc


Pitfalls of loading Flex modules

18 August 2009

I’ve just run into two issues with Flex module loading. Maybe it helps somebody…

Hungry garbage collector

When loading modules, it’s important that you keep a reference to the IModuleInfo instance. I stored them in a Vector and thought that would be enough. What I missed was a subtle timing issue: I checked the Vector for loaded modules after I had pushed a new loading job into it. So it could happen that right after a job was pushed into the Vector, it got thrown into the garbage.
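The basic pattern of holding on to the IModuleInfo looks roughly like this. It is a minimal sketch, not my production code; the class name ModuleLoaderSketch and the trace output are made up for illustration.

```actionscript
package {
import mx.events.ModuleEvent;
import mx.modules.IModuleInfo;
import mx.modules.ModuleManager;

// Hypothetical loader that keeps strong references to all requested modules.
public class ModuleLoaderSketch {

	// As long as an IModuleInfo sits in this Vector, it cannot be garbage collected.
	private var _modules:Vector.<IModuleInfo> = new Vector.<IModuleInfo>()

	public function load(url:String):void {
		var info:IModuleInfo = ModuleManager.getModule(url)
		info.addEventListener(ModuleEvent.READY, onReady)
		info.addEventListener(ModuleEvent.ERROR, onError)
		// store the reference before loading starts and
		// do not remove it while the module is loading or in use
		_modules.push(info)
		info.load()
	}

	private function onReady(event:ModuleEvent):void {
		trace("module ready: " + event.module.url)
	}

	private function onError(event:ModuleEvent):void {
		trace("module failed: " + event.errorText)
	}
}
}
```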

Stale Expires HTTP headers

Another nice one is HTTP caching headers: By mistake, I assigned all Flex modules a fixed Expires HTTP header. It expired this morning and loading modules became a seemingly random process: The first module loaded successfully, but the second and third failed. When the same Flash Player instance loaded the module a second time, everything worked well. After some debugging, I found the culprit in the HTTP header. After removing the Expires header, everything runs smoothly again.

Let me know if you have other stories about module loading.


Stand up AGAINST Generics/Concurrency in Flash Player

7 August 2009

I’ve just read Joa Ebert’s excellent post “This is an outrage” about the Flash platform. I totally agree on the big picture but don’t like some language proposals. As my tiny little contribution, I’d like to dissect and reassemble some often heard wishes for the ActionScript language.

Reading about generics and concurrency in Flash Player makes me nervous. I have read many wishes about those two features but never found a very precise description of what people really want. I fear that many think of language features found in C++, C# or Java. The problem I have with those languages is that their complexity massively reduces developer efficiency and effectiveness. Why do so many developers flee to Ruby, Python, Scala or F#? How can ThoughtWorks state that they are roughly two times faster when developing with a language that has no generics, while the feature is still high on the wish list for ActionScript?

Generics

Please correct me if I’m wrong, but generics would not let you do anything in ActionScript that you cannot do today. So we are talking about a usability feature for the developer, right? Instead of endless casts and writing tests, the compiler could do some of this work. But do generics really make you write and, much more importantly, read ActionScript code faster? If I think of Java 1.6 generics, my answer is certainly no and that’s the reason why I’m against generics as known in Java.

However, I see some areas where type safety increases developer productivity. Vector is a great example and like Joa, I’d like to see some more native collections in Flash Player that use generics. But besides those exceptions, I don’t need generics.

Concurrency

I see two feature requests in concurrency: The first one is run time performance. The number of CPU cores is going to increase steadily and in Flash Player 10, developers can only leverage several cores through PixelBender. I think of PixelBender as a machine that launches as many threads as there are pixels in the output. Everything is finally joined when jumping back to ActionScript. So basically, we already have concurrency for a dedicated purpose. Sure, Flash Player needs to be able to leverage several cores (as it already does today) but I’d prefer if the VM could take care of concurrency. While writing code, I don’t want to think in CPU cores but in my business domain. Though I must admit that I don’t know anything about the feasibility of such an approach.

The second feature in concurrency is parallel tasks in an application. For example, saving data to a server and updating the display could run in parallel. The developer needs to control these parallel tasks. I know Java reasonably well and I fought through Java Concurrency in Practice, which I recommend. However, if it takes a book of 400 pages to learn the basics of concurrency, the feature is never worth the pain of learning it. I would argue that concurrency introduces a level of complexity that nobody can master in a mid-sized project anymore. You end up fixing the issues you know about, but you cannot validate that the code is correct. Testing concurrency is not even in its infancy.

That’s the reason why I plead for a much simpler solution compared to the implementation in Java. Maybe having a fixed number of e.g. 4 threads makes it a lot easier. Or maybe making all accesses synchronized by default and defining the exceptions is something to consider; I don’t know. I don’t pretend to know a great solution, I only know that I don’t want any solution I know today.

Bonus feature: Overloading

While allowing all standard namespaces on constructors would reduce complexity in ActionScript, overloading certainly adds more complexity. What’s the benefit compared to handling the different types in the constructor or method itself, as in e.g. Python? If we had powerful pattern matchers like Scala has, I could easily live without overloading. Pattern matchers could also play a major role in solving the “generics task”.

About Complexity

I find that the comment of Lee Brimelow gets to the point: “…there is also a huge segment of the community who is still really struggling with moving to AS3 as it is now”. For example, I would argue that probably half of all Flex developers are not aware that they create a potential memory leak when referencing objects in an array. They don’t care and they shouldn’t. Adding the features mentioned above will make even more people write buggy code.

Better and faster results

What would make me faster at reading and writing code is, for example, type inference and more polished functional programming constructs. Which example do you understand faster?


//ActionScript 3
var vw:Car = new Car()
var cars:Vector.<Car> = new Vector.<Car>
cars.push(vw)
readyToDrive = cars.every(function(car:*, index:int, vector:Vector.<Car>):Boolean {
	return Car(car).hasEnoughOil
})

or

//ActionScript 2010
vw = new Car()
cars = new Vector.<Car>
cars.push(vw)
readyToDrive = cars.every(return car.hasEnoughOil)

Some more spontaneous thoughts for language improvements:

  • Like in Dictionary, allow weak references in many places.
  • I’m not sure about adding [Mixin]. Haven’t thought through it yet.
  • {} should become in most situations optional as the semicolon is today.
  • If you care about raw performance, use Alchemy, write bytecode with the help of the excellent haxe project, launch a job for PixelBender or pray that the compiler gets more love by Adobe. It’s also clear that AVM2 needs to speed up as well. But don’t clutter ActionScript with pure performance features. After all, ActionScript is just a layer above byte code that should increase developer productivity (performance improvements from the compiler seem too far away).
  • (Sorry, this is my personal, pretty much hopeless fight :) What the community could already do today is dropping this silly ;;;;;;;;;; at the end of each line. It’s just visual clutter and about as useful as adding “//end of line” at each line end.

Finally

My point is that ActionScript should not blindly chase any other language. Adobe should only add language features if they either increase developer productivity or enable new possibilities for the Flash Platform including run time performance. And of course, only after a thorough discussion in the community before implementing.

But at the end of the day, I would prefer to use Python or Ruby to program Flash and leave the stalled ECMA standard behind. Hey, maybe Adobe pops up with an equally solid APython or ARuby compiler at MAX?


Mocking in Flex/ActionScript has arrived: mockito-flex

8 July 2009

The announcement of Google Chrome OS is certainly a big bang in the IT world. However, what made my day was not that but the public release of mockito-flex. Up to now, it’s been quite painful to create mock objects in Flex. mock-as3 was the best option until I read on InfoQ about mockito-flex. It’s based on asmock but the syntax feels very much like mockito for Java (which is in my opinion the best mock library in Java). The documentation is still sparse, so below are all the things I need to know:


given(mockie.foo(10)).willReturn("10")
given(mockie.foo(10)).willThrow(new Error("oups"))
given(mockie.foo()).will(new GenericAnswer(incrementCounter))
verify().that(array.push("1"))
verify().that(array.pop())
verify().that(mockie.baz(eq("one"), any()))
verify().that(mockie.baz(argThat(new GenericMatcher("two", contains)), eq(10)))
verify(never()).that(testClass.foo())
verify(times(3)).that(testClass.foo()) // atLeast(4), atLeastOnce(), once()


Cheat sheet: Flash Player HTTP talking with HTTPService, URLLoader and friends

17 June 2009

HTTP communication with Flash Player is messy and very uncomfortable: There are a few classes to know, HTTP headers are never available and different bugs in different browsers don’t help either. So I’ve started to compile a cheat sheet for Flash Player and started another one for Adobe AIR.
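As a starting point, here is a minimal sketch of a plain GET request with URLLoader, the lowest-level of these classes. The class name HttpSketch and the URL are placeholders, not part of the cheat sheet.

```actionscript
package {
import flash.display.Sprite;
import flash.events.Event;
import flash.events.HTTPStatusEvent;
import flash.events.IOErrorEvent;
import flash.events.SecurityErrorEvent;
import flash.net.URLLoader;
import flash.net.URLRequest;

// Hypothetical document class that loads a text resource over HTTP.
public class HttpSketch extends Sprite {

	public function HttpSketch() {
		var loader:URLLoader = new URLLoader()
		loader.addEventListener(Event.COMPLETE, onComplete)
		loader.addEventListener(HTTPStatusEvent.HTTP_STATUS, onStatus)
		loader.addEventListener(IOErrorEvent.IO_ERROR, onError)
		loader.addEventListener(SecurityErrorEvent.SECURITY_ERROR, onError)
		loader.load(new URLRequest("http://example.com/data.xml")) // placeholder URL
	}

	private function onStatus(event:HTTPStatusEvent):void {
		// depending on the browser, the status may be reported as 0
		trace("status: " + event.status)
	}

	private function onComplete(event:Event):void {
		trace("body: " + URLLoader(event.target).data)
	}

	private function onError(event:Event):void {
		trace("request failed: " + event)
	}
}
}
```

Note that the only response information you get in the browser is the status code and the body; there is no event that hands you the response headers.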

Leave a note if you want to contribute to the Google doc.


Migrating to Flex 4 (Gumbo): Neither epic fail nor seamless

10 June 2009

There has been some debate about whether the migration to Flex 4 is an epic fail or a breeze. As so often, the truth is somewhere in between. Here is my experience with porting BlocStac.com, an application with roughly 90,000 lines of code including tests, 2 libraries and 8 modules plus 2 AIR helper applications.

A good start for the migration is an Adobe article by Joan Lafferty. I also appreciate the information in Adobe’s Wiki. What also helps is that the term “gumbo” works nicely for search. What definitely doesn’t work is searching Adobe’s forums: you can’t search within the current forum and selecting a single forum in advanced search is impossible due to the sheer number of more than 500 forums.

Skinny CSS, fat Skins

All definitions of borders and paddings in CSS are happily ignored in Flex 4. Using the compiler option “theme=path/halo.swc” is not an option as having components with halo skins and spark skins is not acceptable. What I’ve done so far:

  1. Move all padding declarations in CSS files into the layout objects of components.
  2. Write custom skins for all components that require border settings. Because I could not believe that one (after all, it’s quite a lot of work), I asked on the forum. Peter deHaan from Adobe gave a fast and helpful answer. I love this level of support.
  3. global does not work yet, “*” is also of no help. Still investigating this one.
  4. Rename “background-color” to “contentBackgroundColor” (does anybody know the reason for this change?)
  5. Add namespace to CSS

If anything could be called an epic fail, it might be CSS migration. I had to recode pretty much everything defined in CSS and CSS is now pretty sparse. I’m not even sure whether CSS is still worth the concept in Flex 4.

Flex-Mojo, AdvancedDataGrid, etc

Some other issues I hit:

  1. Flex-mojo: VELO does a great job at bringing Maven to the Flex world. It took him only a few hours to include Flex 4beta1 in flex-mojo 3.3-SNAPSHOT and fix an issue with unit testing. Though I hope Adobe offers a Maven repository some day… vote for SDK-12730!
  2. AdvancedDataGrid: TypeError: Error #1007: Instantiation attempted on a non-constructor. at mx.controls::AdvancedDataGridBaseEx/getSeparator()… brought me to Matt Chotin’s answer that it’s not available in Flex 4beta1 (4.0.0.7219). Au contraire, the only Flex4 version that has a working datavisualization library is Flex 4beta1. It comes bundled with Flash Builder beta1 but the library is not compatible anymore with current builds.

Is it worth the pain?

Not yet. Not having AdvancedDataGrid is a show stopper for BlocStac.com. All the work for CSS is unfortunate but it probably leads to better code (though I’m not sure yet). If it really brings Flex forward, it’s fine. After all, I’d rather see an evolving ecosystem that lets obsolete concepts die out than e.g. the Java language that drags everything along.

