
Stage 1: Identifying Executable Files

The first stage of the classifier determines whether a file is an ITS executable binary. This is difficult because ITS directories have no special "execute" flag and the executable format varied over time. The classifier must therefore decode the instructions in the file and look for the jump instruction that occurs at the end of every executable file.

To properly decode the word file format, one should start at the beginning of the file and work forward. The word file format was implemented using a small finite state machine, so a decoder that starts in the middle can never be certain that it has not skipped over some piece of state. The file classifier is not this meticulous: to save time, it uses a heuristic to guess whether the file is an executable. The cases covered are intended to be robust enough to catch every executable file. In the test runs so far, every classification verified by hand has been correct.
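As a rough illustration of the kind of check described above, the following Python sketch tests whether a file's final words contain a jump instruction. It is not the classifier's actual code. It assumes the file has already been decoded into 36-bit words, that the opcode occupies the top 9 bits of each word (the PDP-10 instruction layout), and that the jump of interest is JRST (octal 254) or JUMPA (octal 324). The names `opcode` and `looks_executable` and the `tail` window size are hypothetical, chosen only for this example.

```python
WORD_MASK = (1 << 36) - 1  # a PDP-10 word is 36 bits wide

# Assumption: the terminating jump is a JRST (0o254) or a JUMPA (0o324).
JUMP_OPCODES = {0o254, 0o324}


def opcode(word):
    """Return the 9-bit opcode field (the top 9 bits) of a 36-bit word."""
    return (word & WORD_MASK) >> 27


def looks_executable(words, tail=4):
    """Guess whether a list of decoded 36-bit words is an executable.

    Only the last `tail` words are inspected (a hypothetical window size),
    so this is a heuristic, not a full decode from the start of the file.
    """
    return any(opcode(w) in JUMP_OPCODES for w in words[-tail:])
```

For example, `looks_executable([0, 0, 0o254000001000])` reports a likely executable because the last word carries the JRST opcode. A check this shallow can misfire on data words that happen to share those opcode bits, which is why results still need verification by hand.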



boogles@martigny.ai.mit.edu