DADA: HW 3: Tricky Jump
Introduction
This assignment will explore what it takes to create a stealthy virus that employs a “tricky jump.” A tricky jump is a form of hijacking in which a jump is inserted to call some virus code. The jump is inserted in such a way that after the virus code runs, the program continues normal execution, thereby maintaining stealth.
This program MUST run on the VirtualBox image provided for this course. You must write it in either C or C++.
This homework was taken, with permission, from a homework created by Charles Reiss, which was taken – again, with permission – from one created by Jack Davidson.
Task
A “tricky jump” can be efficiently implemented (only six bytes) as:
pushq $AddressOfVirusFunction
ret
This can be encoded on x86-64 using only six bytes, and the encoding does not change based on where the push instruction is placed. This makes it easy to compute the machine code separately from inserting it somewhere, and so has been commonly seen in viruses.
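As a rough illustration of why the encoding is position-independent, the sketch below builds the six bytes from nothing but the virus address. The helper name encode_push_ret and the constant TRICKY_JUMP_LEN are made up for this example; they are not part of any provided code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 1 opcode byte + 4 address bytes + 1 ret byte */
#define TRICKY_JUMP_LEN 6

/* Hypothetical helper: writes the machine code for
 *     pushq $target
 *     ret
 * into out[0..5]. In 64-bit mode the 32-bit immediate is sign-extended,
 * so this only works for addresses below 0x80000000. */
void encode_push_ret(uint32_t target, uint8_t out[TRICKY_JUMP_LEN]) {
    assert(out != NULL);
    assert(target < 0x80000000u);
    out[0] = 0x68;                             /* push imm32 */
    out[1] = (uint8_t)(target & 0xff);         /* immediate, little-endian */
    out[2] = (uint8_t)((target >> 8) & 0xff);
    out[3] = (uint8_t)((target >> 16) & 0xff);
    out[4] = (uint8_t)((target >> 24) & 0xff);
    out[5] = 0xc3;                             /* ret */
}
```

For a hypothetical virus address of 0x400670, this yields the bytes 68 70 06 40 00 c3, no matter where in the executable they are later written.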
One could also implement a “tricky jump” by inserting a conventional jump instruction:
jmp AddressOfVirusFunction
However, a jmp instruction uses relative
addresses (whereas pushq uses absolute addresses),
so the resulting machine code will change based on where the jump is
inserted.
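To make the position dependence concrete, here is a minimal sketch of encoding a near jmp with a 32-bit displacement (opcode 0xe9). The displacement is measured from the end of the 5-byte instruction, so the same target produces different bytes at different insertion sites. The function name and the addresses in the note below are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 1 opcode byte + 4 displacement bytes */
#define JMP_REL32_LEN 5

/* Hypothetical helper: writes "jmp target" as it would be encoded if the
 * instruction itself were located at address `site`. */
void encode_jmp_rel32(uint64_t site, uint64_t target, uint8_t out[JMP_REL32_LEN]) {
    /* displacement is relative to the address of the next instruction */
    int64_t diff = (int64_t)target - (int64_t)(site + JMP_REL32_LEN);
    uint32_t rel;

    assert(out != NULL);
    assert(diff >= INT32_MIN && diff <= INT32_MAX);
    rel = (uint32_t)(int32_t)diff;

    out[0] = 0xe9;                          /* jmp rel32 */
    out[1] = (uint8_t)(rel & 0xff);         /* displacement, little-endian */
    out[2] = (uint8_t)((rel >> 8) & 0xff);
    out[3] = (uint8_t)((rel >> 16) & 0xff);
    out[4] = (uint8_t)((rel >> 24) & 0xff);
}
```

With a made-up target of 0x400670, a jump placed at 0x400661 encodes as e9 0a 00 00 00, while the same jump placed at 0x400700 encodes as e9 6b ff ff ff.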
When either sequence is executed, control is diverted to the virus
code. The tricky jump pushes the virus address onto the stack, and then
the standard ret at the end of the (infected) subroutine
jumps to the virus code. When the virus is done, it executes its own
ret, which returns to the actual caller of the infected
function. If the virus writer inserts the tricky jump at the end of an
application function (i.e., to replace the ret), then the
program, after the virus code executes, will continue to run as if
nothing happened. For example, one might see code like:
400661: c3 retq
400662: 66 66 66 66 66 2e 0f data32 data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)
400669: 1f 84 00 00 00 00 00
data32 data32 data32 data32 nopw %cs:0x0(%rax,%rax,1) is
objdump’s representation of a 14-byte long nop instruction. This is
padding added at the end of the function. This is a “cavity” that gives
a virus writer some room to work. If we insert a “tricky jump” starting
where the retq instruction is located (address 0x400661), then the virus
code will be invoked. When the virus code returns, control will be
returned to the function that invoked this function.
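One way to look for such a spot in a file that has been read into memory is a plain byte search. The sketch below searches for the ret byte followed by the 14 padding bytes copied from the listing above. The names find_bytes and RET_THEN_PADDING are invented for this example, and the pattern is only an assumption about what the padding looks like; other functions or other compilers may be padded differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ret (c3) followed by the 14-byte padding nop from the listing above */
static const unsigned char RET_THEN_PADDING[] = {
    0xc3,
    0x66, 0x66, 0x66, 0x66, 0x66, 0x2e, 0x0f,
    0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00
};

/* Hypothetical helper: returns the offset of the first occurrence of
 * needle in hay, or -1 if it does not occur. */
long find_bytes(const unsigned char *hay, size_t hay_len,
                const unsigned char *needle, size_t needle_len) {
    size_t i;

    assert(hay != NULL && needle != NULL);
    if (needle_len == 0 || needle_len > hay_len) {
        return -1;
    }
    for (i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0) {
            return (long)i;
        }
    }
    return -1;
}
```

The value returned is an offset into the buffer, and therefore into the file, not a memory address. The same pattern may appear at the end of more than one function, so the first match is not necessarily the function you want to infect.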
For this assignment, you will write a C program that infects a particular Linux executable and causes some virus code to be executed.
The Linux executable you want to infect is called target.exe, but
that file is not included in this repository. You can download the target.c source code and compile it with:
gcc -falign-functions=16 -o target.exe target.c.
target.exe produces the following output:
Initialize application.
Begin application execution.
Terminate application.
(After downloading target.exe, you may need to mark it
as executable with a command like chmod +x target.exe. Then
you should be able to run it using ./target.exe.)
Your program should modify target.exe into a
target-infected.exe which will produce the following
output:
Initialize application.
You have been infected with a virus!
Begin application execution.
Terminate application.
Note: add the second line exactly as is, as the auto-grading scripts will be looking for that line. If you add extra spaces, spelling mistakes, different punctuation, etc., you will lose points!
You will use the “tricky jump” method of infection. The push version is probably the easiest to use, but you may use any technique. To simplify this assignment:
- The executable has a large “hole” (unused space filled with nops) in which to place the non-malicious “virus code”, and we will supply working “virus code” for you.
- You only need to handle infecting this particular executable, but we expect your infection program to be fairly easy to port to new executables. (For example, you should not just have a copy of the output file inside your C file.)
The “virus” code we want you to insert is the following (also available as a .s file or a .o file):
leal string(%rip), %edi
pushq $0x4004e0 /* address of puts in target executable */
retq
string:
.asciz "You have been infected with a virus!"
You can copy the resulting machine code into the large cavity in the
executable. This assembly code is carefully written to not require
changes to the machine code depending on where in the executable it is.
(This is why it does not call puts with a jmp
or call instruction or use mov $string, %edi.)
It will, however, not work in other executables because it hard-codes
the address of puts in this executable. (The simplest way
to avoid this problem would be to replace the call to puts
with a direct use of the system call used to implement
puts.)
Submit a C program that, when compiled and executed, reads an
executable called target.exe and produces an executable
called target-infected.exe.
target-infected.exe must be the same length as
target.exe.
Also, answer the following questions:
- How did you identify the file offsets in target.exe to overwrite?
- How did you produce the machine code to insert for the tricky jump to the virus code?
- If your infect.c has a hard-coded offset or something similar, how would you automate finding the location in target.exe to overwrite with a tricky jump so that it would work on other target programs? (For this question, ignore the problem of fixing the inserted “virus” code to work in other executables.)
Submission
Submit the following files:
- Your infect.c or infect.cpp (we don’t care if you do it in C or C++, but it must be in one of those)
- A Makefile that will compile your file into an executable named a.out
- A file answers.txt containing the answers to the above questions
The names matter, as the autograder will mark points off if they are not what is expected.
When we run your program, we will put the specified
target.exe in the same directory as the a.out
executable, and we will expect the result to be a file named
target-infected.exe.
Methodology and Hints
You should use the utility objdump to examine the executable target.exe. The option --disassemble is useful. In particular, you need to determine the starting address of the virus code. The disassembly will also help you determine the opcodes of the instructions that you need to insert (i.e., a push instruction and a ret instruction). You may wish to consult the objdump manual (man objdump).
Identify where the constant strings “Initialize application” and “Begin application execution” are referenced to locate relevant parts of the application code.
Look for a large area of nop opcodes in the disassembly to determine where to insert the virus code. Record the address of this location in memory to generate the “tricky jump” code you will insert elsewhere in the executable.
To insert both the virus code and the tricky jump itself, the trick is that you must map the address of the location in the executable to the offset of the proper byte in the file. You need to do this mapping because the file offset where you want to write is not the same as the address of the instruction when the program is loaded in memory (which is what objdump usually shows you).
- One option is to figure out what options you can pass to objdump to get it to display the offset of code within the executable file.
- Another option is to get a hexadecimal dump of the raw file and look for bytes shown in objdump output in the actual executable file to find their location.
- Yet another option would be for your infect.c to search for particular bytes in the executable file itself.
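Whichever option you pick, the underlying arithmetic is the same: the bytes of a loadable segment sit at some offset in the file and are mapped at some address in memory, and the two differ by a constant within that segment. The sketch below assumes you already know those values for the segment containing your code (in an ELF file they are the p_vaddr, p_offset, and p_filesz fields of a PT_LOAD program header, which readelf -l prints). The struct and function names are hypothetical, and the numbers in the note are placeholders, not values taken from target.exe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one loadable segment. */
struct segment {
    uint64_t vaddr;   /* address of the segment's first byte in memory */
    uint64_t offset;  /* offset of the segment's first byte in the file */
    uint64_t filesz;  /* number of bytes of the segment stored in the file */
};

/* Maps a memory address to a file offset. Returns 0 on success, or -1 if
 * the address is not backed by file bytes of this segment. */
int vaddr_to_offset(const struct segment *seg, uint64_t addr, uint64_t *offset_out) {
    assert(seg != NULL && offset_out != NULL);
    if (addr < seg->vaddr || addr - seg->vaddr >= seg->filesz) {
        return -1;
    }
    *offset_out = addr - seg->vaddr + seg->offset;
    return 0;
}
```

For a placeholder segment mapped at 0x400000 from file offset 0, address 0x400661 would correspond to file offset 0x661. Check the real values for your executable rather than relying on these.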
A push of a 32-bit constant (on 32- or 64-bit x86) can be encoded as an 0x68 byte followed by the (little-endian) constant. A ret is encoded as 0xc3. A jump can be encoded as an 0xe9 byte followed by a 32-bit offset from the address of the following instruction (0xe8 is the opcode for a call with the same operand format).
A very useful program to examine the file is a hex editor such as ghex. You can install ghex using sudo apt-get install ghex.
To simplify the assignment, you can hardcode the input and output file names in your infect program. That is, infect.c opens and reads target.exe and opens and writes target-infected.exe. After you produce target-infected.exe you will probably need to set the execute permissions on the file (your program does not have to set those itself; that can be done manually).
To read from and write to a binary file in C, you can use fopen, fread, and fwrite. You can run man fopen, man fread, etc., to get documentation for how these functions are called, or search online. An example usage of a program that copies “input.dat” to “output.dat” is the following:
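#include <stdio.h>
#include <stdlib.h>

int main(void) {
    FILE *in;
    FILE *out;
    char *buffer;
    long size;

    in = fopen("input.dat", "rb");
    if (in == NULL) {
        perror("input.dat");
        return 1;
    }
    /* get size of input.dat, by
       moving to the end of the file */
    fseek(in, 0, SEEK_END);
    size = ftell(in);
    /* then, return to the
       beginning of the file */
    fseek(in, 0, SEEK_SET);
    buffer = malloc(size);
    if (buffer == NULL || fread(buffer, 1, size, in) != (size_t)size) {
        fprintf(stderr, "could not read input.dat\n");
        return 1;
    }
    fclose(in);

    out = fopen("output.dat", "wb");
    if (out == NULL) {
        perror("output.dat");
        return 1;
    }
    fwrite(buffer, 1, size, out);
    fclose(out); /* close the output file, not the input file again */
    free(buffer);
    return 0;
}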
The hard part is figuring out what locations in the file need to be changed and what they should be changed to. The code to do the infection is small.
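As a hedged sketch of how small the infection step can be: once the file is in a buffer, overwriting bytes in place changes the contents but not the length, which also keeps the output the same size as the input. The helper name patch_bytes is invented for this example; the offsets and bytes you pass to it are the part you have to work out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrites patch_len bytes of image starting at
 * offset. Returns 0 on success, or -1 (leaving image untouched) if the
 * patch would run past the end of the buffer. */
int patch_bytes(unsigned char *image, size_t image_len, size_t offset,
                const unsigned char *patch, size_t patch_len) {
    assert(image != NULL && patch != NULL);
    if (offset > image_len || patch_len > image_len - offset) {
        return -1;
    }
    memcpy(image + offset, patch, patch_len);
    return 0;
}
```

A program built this way would call the helper once for the virus code and once for the tricky jump, then write the whole buffer out with fwrite.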
We are reading and writing binary files, not text files. You may need to open files in binary mode, not text mode.
The virus code we’ve given finishes by returning with a ret instruction. (This is actually by returning from puts().) So wherever you insert the jump to the virus function needs to be a place where it is safe to return from. If you are experiencing a segfault after the virus code prints out its message, this is the most likely reason why.