Since I forgot to provide the install.bat file in the CD I handed in, here's the files all bundled together as they would be for a full install: http://www.sendspace.com/file/b3sifg
Note: It's before 2pm, technically still before the hand-in date!
Aether Anti-Virus Project
This is a blog related to my final year project. It will be used to express my ideas as I design and develop an anti-virus program.
Wednesday 4 May 2011
Tuesday 3 May 2011
The end is only the beginning
Today is hand-in day!* Aether is working nicely, and my code is documented. I'll soon be off to catch a train, ready to hand in my completed work. I've learnt a lot down the line, and there have been high moments (first full detection and removal!) and low moments (having to re-implement IPC for the second time, gah!) and I am glad I don't have to feel the pressure any more. But this isn't the end for Aether. It's such a fun and interesting project that I've decided to continue work on it as a personal project. Hopefully I can expand and improve it to the stage where it would in fact become a viable anti-malware solution.
* correction: hand in day turned out to be the 4th
* correction: hand in day turned out to be the 4th
Friday 22 April 2011
AppData in .NET service
Found an interesting .NET issue today.
This should result in:
[vista/7] C:\Users\Username\AppData\Roaming\Aether\AetherService.log
[winxp/2k] C:\Documents and Settings\Username\Application Data\Aether\AetherService.log
Since the service is running as SYSTEM, %AppData% actually turns out to be blank. It therefore creates a directory called %AppData% in C:\Windows\System32, with the contents being created under that. A quick workaround is to read the name of the logged on user manually and then create the path yourself.
if(!Directory.Exists(@"%AppData%\Aether")) Directory.Create(@"%AppData%\Aether"); File.WriteAllText(@"%AppData%\Aether\AetherService.log", "test");
This should result in:
[vista/7] C:\Users\Username\AppData\Roaming\Aether\AetherService.log
[winxp/2k] C:\Documents and Settings\Username\Application Data\Aether\AetherService.log
Since the service is running as SYSTEM, %AppData% actually turns out to be blank. It therefore creates a directory called %AppData% in C:\Windows\System32, with the contents being created under that. A quick workaround is to read the name of the logged on user manually and then create the path yourself.
Saturday 16 April 2011
Final push
I can't believe there's only two weeks to go before the deadline! There's a lot more work to do, but I'm confident. I've got definitions matching malware and I've finished 95% of the individual components that make up the project. The major process now is sticking it all together into a working product (can't be too difficult, can it?) and documenting it. Thankfully I've got a knack for creating documentation quickly.
Speaking of documentation, I created a cool little tool called Docugen that converts Visual Studio's XML project documentation into nicely formatted HTML pages. The idea is simple - you document your classes using inline XML documentation so that Visual Studio can produce a complete set of XML documentation for your assemblies, then you run the resulting documentation file through Docugen and it produces HTML in the specified target directory. It's relatively simple to insert this tool into your post-build script in your project too. Somewhat ironically, I've not really commented much of the code in Docugen. It's not officially a part of Aether, so I guess it doesn't matter too much. Anyhow, I spent most of today going through my code and inserting XML documentation comments to all of my classes, functions, fields, events and properties. It's tedious work, but it results in significantly more maintainable code.
I've had to make a few last-minute decisions regarding internal application structure. IPC, for example, has been plagued with problems and I'm still trying to find a method that works reliably. The KMD interaction is nice and it works perfectly for now. The KMD caused a system crash during a demo, but I can't isolate or reproduce the bug on my desktop. It's probably related to the stack/queue system I created for events, so I'll have to review that code later.
Speaking of documentation, I created a cool little tool called Docugen that converts Visual Studio's XML project documentation into nicely formatted HTML pages. The idea is simple - you document your classes using inline XML documentation so that Visual Studio can produce a complete set of XML documentation for your assemblies, then you run the resulting documentation file through Docugen and it produces HTML in the specified target directory. It's relatively simple to insert this tool into your post-build script in your project too. Somewhat ironically, I've not really commented much of the code in Docugen. It's not officially a part of Aether, so I guess it doesn't matter too much. Anyhow, I spent most of today going through my code and inserting XML documentation comments to all of my classes, functions, fields, events and properties. It's tedious work, but it results in significantly more maintainable code.
I've had to make a few last-minute decisions regarding internal application structure. IPC, for example, has been plagued with problems and I'm still trying to find a method that works reliably. The KMD interaction is nice and it works perfectly for now. The KMD caused a system crash during a demo, but I can't isolate or reproduce the bug on my desktop. It's probably related to the stack/queue system I created for events, so I'll have to review that code later.
Tuesday 1 March 2011
Languages
One thing people tend to ignore when creating applications is that once they are on the internet, anyone can download them. This means that potential end users may not speak English. In order to maximise sales, multilingual applications are pretty much a requirement. A quick and easy way to set up multilingual applications is to create a language definition file. Each entry is given a token to identify it across languages. This token can be in any language, but obviously an English speaking developer would be best to use English.
For example, the following might be part of a German language definition:
This allows the developer to parse the language file into a token-indexed array such as a Dictionary collection. In order to run on a particular language, the language definition file is loaded into the application and the code uses the individual tokens to display the message.
However, when we're working with an anti-malware application, we need to consider security as well. What if an attacker were to modify the language definition file to show a different message and convince the user to do something detrimental? As such, we need to secure the language file with a digital signature to prove its authenticity. This can be done with the same RSA keypair we use for storing definitions.
For example, the following might be part of a German language definition:
yes Ja no Nein cancel Abbrechen error Fehler register_complete Danke für die Registrierung.
This allows the developer to parse the language file into a token-indexed array such as a Dictionary collection. In order to run on a particular language, the language definition file is loaded into the application and the code uses the individual tokens to display the message.
MessageBox.Show(Language["register_complete"]);
However, when we're working with an anti-malware application, we need to consider security as well. What if an attacker were to modify the language definition file to show a different message and convince the user to do something detrimental? As such, we need to secure the language file with a digital signature to prove its authenticity. This can be done with the same RSA keypair we use for storing definitions.
Monday 14 February 2011
Driver and application intercommunication
One of the most fundamental principles of the ring protection model in modern CPUs is that intercommunication between processes in different rings has to be done in a very specific way as to provide a guarantee that code has a chance to check and sanitize incoming data. For example, the ring model would be near useless if a user-mode process could push items to the ring0 stack directly, or create kernel threads at arbitrary addresses. Mechanisms such as call gates and I/O Request Packets (IRPs) provide a way to interoperate between user- and kernel-mode.
It is possible, through a Windows API called DeviceIoControl, to send IRPs from user-mode to kernel-mode. Each driver has a list of handler functions that it can bind a function to, or not if it does not wish to handle that particular functionality. These functions are called 'major functions' and are assigned through the DRIVER_OBJECT->MajorFunction array. One of these major functions is designated IRP_MJ_DEVICE_CONTROL, which is called whenever an IRP is sent to the driver. When pointed at an appropriate handler function, the driver intercepts this IRP and can process it however it wants. The most important parts of the IRP structure (at least when using buffered IO) are the IoControlCode and the SystemBuffer parameters. When the user-mode application sends an IRP to the kernel-mode driver, it specifies a control code to define what it is asking the driver to do. It may also provide a buffer for input or output data. The driver performs the task specified by the control code given, utilising SystemBuffer to take parameters or send back data to the user-mode application.
In AetherAV there is a global event object defined under BaseNamedObjects that both user- and kernel-mode code has access to. When a new event occurs, the kernel-mode driver signals this event and the user-mode application sends a DeviceIoControl request back to the driver in order to find out certain information about the event, and finally invokes a managed event to tell the rest of the application about the event.
One issue we have to handle is preemption - e.g. new thread event occurs, timer signalled, application sends IRP to ask for new thread ID, driver sends value back, a second new thread event occurs and overwrites thread ID and process ID values, application sends IRP to ask for new thread's associated process ID, wrong process ID sent back. To stop this, we set up a kernel-only event object that is waited upon until the application has asked for all the required information about the event. This prevents new updates from occuring before the application has asked about the event. If new thread/process starts have occured since then, the system creates a backlog queue of IRPs to the notification handler and the driver will repeatedly signal the user-mode application about these new events until the backlog has been cleared.
It is possible, through a Windows API called DeviceIoControl, to send IRPs from user-mode to kernel-mode. Each driver has a list of handler functions that it can bind a function to, or not if it does not wish to handle that particular functionality. These functions are called 'major functions' and are assigned through the DRIVER_OBJECT->MajorFunction array. One of these major functions is designated IRP_MJ_DEVICE_CONTROL, which is called whenever an IRP is sent to the driver. When pointed at an appropriate handler function, the driver intercepts this IRP and can process it however it wants. The most important parts of the IRP structure (at least when using buffered IO) are the IoControlCode and the SystemBuffer parameters. When the user-mode application sends an IRP to the kernel-mode driver, it specifies a control code to define what it is asking the driver to do. It may also provide a buffer for input or output data. The driver performs the task specified by the control code given, utilising SystemBuffer to take parameters or send back data to the user-mode application.
In AetherAV there is a global event object defined under BaseNamedObjects that both user- and kernel-mode code has access to. When a new event occurs, the kernel-mode driver signals this event and the user-mode application sends a DeviceIoControl request back to the driver in order to find out certain information about the event, and finally invokes a managed event to tell the rest of the application about the event.
One issue we have to handle is preemption - e.g. new thread event occurs, timer signalled, application sends IRP to ask for new thread ID, driver sends value back, a second new thread event occurs and overwrites thread ID and process ID values, application sends IRP to ask for new thread's associated process ID, wrong process ID sent back. To stop this, we set up a kernel-only event object that is waited upon until the application has asked for all the required information about the event. This prevents new updates from occuring before the application has asked about the event. If new thread/process starts have occured since then, the system creates a backlog queue of IRPs to the notification handler and the driver will repeatedly signal the user-mode application about these new events until the backlog has been cleared.
Saturday 12 February 2011
IPC in Aether
Anyone who's ever dealt with unmanaged memory across multiple managed processes will know that it becomes very cumbersome to handle, especially when some of your code is single-threaded and some is multi-threaded. The hardest part is to synchronise the access of data. A mutex on a memory section can often be the best choice, since it works across the whole system. Unfortunately this solves but a single problem in a sea of glitches and issues. As such, the IPC in Aether has been transformed to use UDP packets instead.
Since UDP is a network protocol, one might assume that the packets would be sent across the network. If this were the case, it could open us up to significant security issues. As it turns out though, it doesn't end up being a problem. If any IPv4 packet is sent to localhost (127.0.0.1) the packet is intercepted by the socket provider and immediately copied back into the input buffer and dropped from the output buffer before it is sent to the hardware.
Since the .NET framework allows for simple access to the UDP protocol through the UdpClient class, it is relatively simple to combine asynchronous UDP sockets with a binary serialization class (BinaryFormatter) and wrapper classes to produce very reliable event-driven IPC.
Since UDP is a network protocol, one might assume that the packets would be sent across the network. If this were the case, it could open us up to significant security issues. As it turns out though, it doesn't end up being a problem. If any IPv4 packet is sent to localhost (127.0.0.1) the packet is intercepted by the socket provider and immediately copied back into the input buffer and dropped from the output buffer before it is sent to the hardware.
Since the .NET framework allows for simple access to the UDP protocol through the UdpClient class, it is relatively simple to combine asynchronous UDP sockets with a binary serialization class (BinaryFormatter) and wrapper classes to produce very reliable event-driven IPC.
Subscribe to:
Posts (Atom)