Adjusting tex docs

2025-12-16 23:33:06 +08:00 · 2022-07-01 11:44:40 -04:00
parent bc4cdcee11
commit e95cc18d49
5 changed files with 6 additions and 22 deletions
--- a/docs/chapters/chapter1.tex
+++ b/docs/chapters/chapter1.tex
@@ -125,7 +125,7 @@ including proof of concepts (PoC) showing specific features, and also by
 building a realistic rootkit system which leverages these PoCs and
 integrates them into a fully operational implant.
-%According to the library guide, previous research should be around here. %Is it the best place tho?
+%According to the library guide, previous research should be around here.
 Before narrowing down our objectives and selecting a specific list of
 rootkit capabilities to provide using eBPF, we analyze previous research in
 this area. The work by Jeff Dileo from NCC Group at DEFCON 27
@@ -179,7 +179,7 @@ The rootkit will work in a fresh-install of a Linux system with the following ch
 \subsection{Social and economic environment}\label{sec:social_econ_env}
 %M-> Mentioned talking about community outreach and its role under pentesting
-%TODO Talk about the difference between having always on BPF and always on kernel modules, BPF is consider "safe" in production while it's almost as dangerous (I think this might fit here)
+%Talk about the difference between having always on BPF and always on kernel modules, BPF is consider "safe" in production while it's almost as dangerous 
 Our world has a growing dependency on digital systems. From the use of
 increasingly complex computer systems and networks in business environments
@@ -318,7 +318,6 @@ Finally, it must be noted that this project uses the libbpf library
 \cite{libbpf_github}, as described in Section \ref{subsection:libbpf}, for
 the development of our eBPF rootkit. This library is licensed under dual
 BSD 2-clause license and GNU LGPL v2.1 license. 
 %Should I say something else? I usually license my own projects under GPLv3 because I don't like corporations taking the code, but I guess I am restricted to use the Creative Commons license.
 \section{Structure of the document}
@@ -341,10 +340,8 @@ This section details the structure of this document and the contents of each cha
 \textbf{Chapter 8: Conclusions and future work} revisits the project objectives, discusses the work presented in this document, and describes possible future research lines.
 \section{Code availability}
 %Is it ok to reference the repo as a cite? Maybe it's better writing the link directly?
 All the source code belonging to the rootkit development can be visited publicly at the GitHub repository \url{https://github.com/h3xduck/TripleCross} \cite{triplecross_github}. The most important folders and files of this repository are described in Table \ref{table:triplecross_dirs}.
 %I can go with more detail if needed. Is it needed?
 \begin{table}[htbp]
 \begin{tabular}{|>{\centering\arraybackslash}p{4cm}|>{\centering\arraybackslash}p{10cm}|}
 \hline
--- a/docs/chapters/chapter2.tex
+++ b/docs/chapters/chapter2.tex
@@ -310,7 +310,6 @@ BPF\_MAP\_TYPE\_PROG\_ARRAY & Stores descriptors of eBPF programs\\
 \subsection{The eBPF ring buffer} \label{subsection:bpf_ring_buf}
 eBPF ring buffers are a special kind of eBPF maps, providing a one-way directional communication system, going from an eBPF program in the kernel to a user space program that subscribes to its events.
 %TODO DIAGRAM OF A TYPICAL RING BUFFER
 \subsection{The bpf() syscall} \label{subsection:bpf_syscall}
 The bpf() syscall is used to issue commands from user space to kernel space in eBPF programs. This syscall is a multiplexor, meaning that it can perform a great range of actions, changing its behaviour depending on the parameters.
@@ -519,8 +518,6 @@ bpf\_skb\_change\_tail() & Enlarges or reduces the extension of a packet, by mov
 \label{table:tc_helpers}
 \end{table}
 %TODO This section might benefit from some diagrams, maybe. It was a bit to extense already, so skipping it from now
 \subsection{Tracepoints} \label{subsection:tracepoints}
 Tracepoints are a technology in the Linux kernel that allows hooking functions in the kernel by connecting a 'probe': a function that is executed every time the hooked function is called \cite{tp_kernel}. These tracepoints are set statically during kernel development, meaning that for a function to be hooked, it needs to have been previously marked with a tracepoint statement indicating its traceability. At the same time, this limits the number of tracepoints available.
@@ -555,7 +552,6 @@ Similarly to kprobes, uprobes have access to the parameters received by the hook
 In eBPF, programs can issue a bpf() syscall with the command BPF\_PROG\_LOAD and the program type BPF\_PROG\_TYPE\_UPROBE, specifying the function with the uprobe to attach to and an arbitrary function probe to call when it is hit. This function probe is also defined by the user in the eBPF program submitted to the kernel.
 % Is this the best title?
 \section{Developing eBPF programs}
 In Section \ref{section:modern_ebpf}, we discussed the overall architecture of the eBPF system which is now an integral part of the Linux kernel. We also studied the process which a piece of eBPF bytecode follows in order to be accepted in the kernel. However, for an eBPF developer, programming bytecode and working with bpf() calls natively is not an easy task, therefore an additional layer of abstraction was needed. 
@@ -583,7 +579,6 @@ Libbpf \cite{libbpf_github} is a library for loading and interacting with eBPF p
 As we discussed in Section \ref{section:modern_ebpf}, eBPF programs are composed of both the eBPF code in the kernel and a user space program that can interact with it. With libbpf, the eBPF kernel program is developed in C (a real program, not a string later compiled as with BCC), while user programs are usually developed in C, Rust or GO. For our project, we will use the C version of libbpf, so both the user and kernel side of our rootkit will be developed in this language.
 % Cites in the following paragraph?
 When using libbpf with the C language, both the user-side and kernel eBPF program are compiled together using the Clang/LLVM compiler, translating C instructions into eBPF bytecode. As a clarification, Clang is the front-end of the compiler, translating C instructions into an intermediate form understandable by LLVM, whilst LLVM is the back end compiling the intermediate code into eBPF bytecode. As it can be observed in Figure \ref{fig:libbpf}, the result of the compilation is a single program, comprising the user-side which will launch a user process, the eBPF bytecode to be run in the kernel, and other structures libbpf generates about eBPF maps and other meta data. This program is encapsulated as an ELF file (a common executable format).
 \begin{figure}[htbp]
@@ -661,7 +656,7 @@ Table \ref{table:ebpf_kernel_flags} is based on BCC's documentation, but the ful
 \subsection{Access control} \label{subsection:access_control}
 It must be noted that, similarly to kernel modules, loading an eBPF program requires privileged access in the system. In old kernel versions, this means either a user having full root permissions, or having the Linux capability \cite{ubuntu_caps} CAP\_SYS\_ADMIN. Therefore, there existed two main options:
-%TODO some words about capabilities
+
 \begin{itemize}
 \item \textbf{Privileged users} can load any kind of eBPF program and use any functionality.
 \item \textbf{Unprivileged users} can only load and attach eBPF programs of type BPF\_PROG\_TYPE\_SOCKET\_FILTER \cite{evil_ebpf_p9}, offering the very limited functionality of filtering packets received on a socket.
--- a/docs/chapters/chapter4.tex
+++ b/docs/chapters/chapter4.tex
@@ -150,7 +150,6 @@ In our rootkit, we will choose the first option, scanning the process virtual me
 \textbf{ASLR}\\
 In order to bypass ASLR, attackers must take into account that, although the address at which, for instance, a library is loaded is random, the internal structure of the library remains unchanged, with all symbols in the same relative position, as Figure \ref{fig:aslr_offset} shows.
 %TODO Add the .data section here
 \begin{figure}[htbp]
 	\centering
 	\includegraphics[width=13cm]{aslr_offset.jpg}
@@ -345,7 +344,6 @@ Therefore, our rootkit will modify GOT using bpf\_probe\_write\_user() with the
 \textbf{Stage 5: Second syscall, execution of the library}\\
 Once we have overwritten GOT with the address of our code cave, the next time the same syscall is called, the PLT stub will jump to our code cave and execute our shellcode. As instructed by it, the malicious library will be loaded and afterwards the flow of execution jumps back to the original glibc function.
 %Explain reverse shell?
 With respect to the malicious library, it forks the process (to keep the malicious execution in the background) and spawns a simple reverse shell which the attacker can use to execute remote commands.
@@ -704,7 +702,7 @@ This type of trigger has not been implemented in our rootkit, although it has be
 \textbf{Advanced pattern-based triggers}\\
 One of the main issues with keyword-based triggers is that, upon inspection of the packet, the trigger is easily recognizable (the payload contains a plaintext string) and this can lead to firewalls and IDSs flagging it as suspicious. 
-We can, however, work on top of the idea of building a pattern that can be recognized by the backdoor, but at the same time seems random enough for an external network supervisor. This is the basis of some of the triggers we can find in real-world rootkit, such is the case of the rootkit Bvp47 \cite{bvp47_report}. %TODO the link is too slow, should we put our repository as a source?
+We can, however, work on top of the idea of building a pattern that can be recognized by the backdoor, but at the same time seems random enough for an external network supervisor. This is the basis of some of the triggers we can find in real-world rootkit, such is the case of the rootkit Bvp47 \cite{bvp47_report}. 
 Bvp47 is a rootkit with C2 capabilities built as a Linux kernel module developed by the NSA Equation Group and discovered by the research laboratory Pangu Lab \cite{pangu_lab}. One of its capabilities is communicating with a backdoor via pattern-based triggers. These triggers are seemingly random, but they follow a hidden pattern that only the entity who knows it will be able to detect it, acting as a "key". The triggers used in the Bvp47 rootkit consist of a TCP packet whose payload has been filled with random memory, with the exception of a selection of bits which are the result of certain XOR operations \cite{bvp47_report_p49}.
@@ -831,8 +829,6 @@ By using the previous maps, the XDP program will first wait until 3 (or 6) packe
 If the previous checks do not fail, it means the packet stream was a multi-stream trigger and the XDP program proceeds to execute the action corresponding to K3.
 %TODO INTRODUCE IMAGES OF SHELLS
 \subsection{Command and Control} \label{subsection:c2}
 This section details the C2 capabilities incorporated in our rootkit, that is, mechanisms that enable the attacker to introduce rootkit commands (not to be confused with Linux commands in a shell) from the remote rootkit client and to be executed in the infected machine, returning the output of the command (if any) back to the client. These rootkit commands are issued by sending a backdoor trigger: as we mentioned, the backdoor executes a different rootkit action depending on the value of K3 in the trigger (available values are displayed in Table \ref{table:k3_values}).
@@ -1467,7 +1463,6 @@ SECRETDIR & DT\_REG (4) & Secret directory where the rootkit hides its files.\\
 \label{table:dtype_dirent}
 \end{table}
 % Just ran out of time to implement this case properly, realized too late this was a thing. Still mentioning it here
 Also, it is of interest to study what would happen if the directory entry to hide was not in the middle of the buffer, but rather it was the first one to be written. In this case, we cannot modify the d\_reclen of the previous entry to trick the user into skipping an entry. In order to illustrate this case, we are providing another technique (although this functionality is not available in the rootkit currently). Figure \ref{fig:getdents_firstentry} illustrates this alternative process.
 As we can observe in the figure, this technique is based on removing the directory entry completely and overwriting it with all of the subsequent entries. After this change, only the return value of the system call would need to be changed (since now the buffer is shorter).
--- a/docs/chapters/chapter7.tex
+++ b/docs/chapters/chapter7.tex
@@ -134,7 +134,6 @@ HP OMEN 16-c0050ns & 1,300 € \\
 \subsection{Software costs}
 All software used during this research work is open source and thus it has no additional cost. This can be observed in Table \ref{table:software_costs}.
 %Ill add the version here
 \begin{table}[htbp]
 \begin{tabular}{|c|c|}
 \hline
@@ -157,7 +156,6 @@ Oracle VM Virtualbox & 0 € \\
 \subsection{Total costs}
 The computation of the total costs involves considering the costs of hardware, software and personnel systems, together with an additive indirect cost related to minor expenses such as Internet connection or electricity consumption. We will consider these costs to be 10\% of the total. Additionally, note that this is a research project and, as such, it would usually be funded, so we would not have any benefits. Table \ref{table:total_costs} shows the total costs of this project.
 %TODO improve the look of this table
 \begin{table}[htbp]
 \begin{tabular}{|c|c|}
 \hline
--- a/docs/document.tex
+++ b/docs/document.tex
@@ -141,7 +141,7 @@ hmargin=3cm
 \DeclareCaptionFormat{upper}{#1#2\uppercase{#3}\par}
 \captionsetup[table]{
-	%format=upper,  UPPER??? Set by the template, but it looks really weird, I got this off
+	%format=upper,  Set by the template, but it looks really weird, I got this off
 	justification=centering,
 	labelsep=period,
 	width=.75\linewidth,
@@ -161,7 +161,7 @@ hmargin=3cm
 	labelsep=period,
 	labelfont=small,
 	font=small,
-	%THE FOLLOWING WAS ADDED BY ME, is this ok? I think it was missed on the template
+	%I added the following, I think it was missed on the template
 	justification=centering		
 }
@@ -267,7 +267,6 @@ hmargin=3cm
 \thispagestyle{plainnofancy}
 \setcounter{page}{3}
 	% So I read that acronyms are not allowed in abstracts and I should write the full name. At the same time, the official ebpf page says it is not an acronym anymore...
 eBPF is a technology introduced in the 3.18 version of the Linux kernel that allows running code in the kernel without the need of loading a kernel module. Although originally intended for filtering packets, eBPF programs can be used for network monitoring, accessing kernel-exclusive resources and tracing activities at the user and kernel space. This has positioned eBPF as a leading environment for the development of network, security and observability tools. During the last years, however, eBPF has been found to be at the heart of the latest innovation on the development of rootkits.