-rwxr-xr-x report/report.tex 214
1 file changed, 160 insertions, 54 deletions
diff --git a/report/report.tex b/report/report.tex
index efc976d..133c2d6 100755
--- a/report/report.tex
+++ b/report/report.tex
@@ -56,15 +56,21 @@ Sorting is one of the most fundamental operations in programming. Sorting
algorithms play a crucial role in efficient searching, organization,
interpretation of data, and optimization problems. Furthermore, sorting is
often the first step towards designing complicated algorithms in computer
-science. This project aims to study efficient sorting of a list of coordinates,
+science.
+\smallskip
+
+\noindent
+This project aims to study efficient sorting of a list of coordinates
when sorting by the second coordinate. This will be done solely in
\textit{x86\_64 Linux Assembly}. Although abstract high level languages such as
\textit{Python} and \textit{Java} allow faster and more convenient development,
Assembly provides precise control over program flow, memory usage and
instruction level execution, making Assembly an excellent environment for
benchmarking and performance analysis.
+\smallskip
+
-Two sorting algorithms quicksort and insertionsort will be implemented and
+\noindent
+Two sorting algorithms, \textit{quicksort} and \textit{insertionsort}, will be implemented and
compared against each other. Several test instances consisting of uniformly random coordinates will be
created with the goal of comparing the implementations' actual runtimes to their
theoretical runtimes and evaluating each implementation.
@@ -72,13 +78,23 @@ theoretical runtimes and evaluate each implementation.
\section{Design}
\subsection{Program Flow}
-The program is structured into four main phases: selecting an algorithm,
-converting the inputfile to the array format, sorting that array, and printing
-the array to stdout. Each phase is designed to handle a specific task, which
-creates modularity and clarity. Within each phase a set of functions manage the
+The program is structured into four main phases:
+
+\noindent
+\textbf{selecting} an algorithm, \textbf{converting} the input file to the array format,
+\textbf{sorting} that array, and \textbf{printing} the array to \textit{stdout}.
+\smallskip
+
+\noindent
+Each phase is designed to handle a specific task, which
+creates modularity and clarity. Within each \textit{phase} a set of functions manages the
corresponding operations, allowing the programmer to focus on one task at a
-time. This design also greatly supports debugging by simplifying tracking and
-solving errors. Below is a flowchart visualizing each step, followed by a short
+time. This design also greatly supports \textit{debugging} by making errors
+easier to track down and resolve.
+\smallskip
+
+\noindent
+Below is a flowchart visualizing each step, followed by a short
summary of the design.
\begin{figure}[H]
@@ -88,19 +104,31 @@ summary of the design.
\label{fig:flowchart}
\end{figure}
+\noindent
\textbf{Parsing the data:} After reading the input, the received file is to be
-parsed, meaning that only useful data will be extracted. This will in turn
-significantly simplifies iteration through the data, by ensuring that only
-meaningful are left. Lastly an important design choice has been creating an
-array of pointers to each coordinate pair. This design is explained in further
+\textit{parsed}, meaning that only useful data will be extracted. This in turn
+significantly simplifies \textit{iteration} through the data, by ensuring that
+only meaningful values are left.
+\smallskip
+
+\noindent
+Lastly, an important design choice has been creating an
+array of \textit{pointers} to each coordinate pair. This design is explained in further
detail later.
+\medskip
+
+\noindent
\textbf{Sorting:} After parsing, a sorting algorithm of choice can be used on
the coordinates, to sort by a preferred coordinate. The chosen algorithm can be
-passed as a command line argument, otherwise a default algorithm will be run.
+passed as a \textit{command line argument}; otherwise a default algorithm will be run.
+\smallskip
+
+\noindent
This design, where an algorithm can be passed to the sorting function, allows the
program to be extended with new algorithms if needed.
+\smallskip
+
+\noindent
\textbf{Generating output:} The last step of the program is converting the
parsed data back into a format that can be printed.
@@ -108,25 +136,43 @@ parsed data back into a format that can be printed.
Each line contains one and only one coordinate consisting of two values
formatted as \texttt{$\langle$x$\rangle$\textbackslash
t$\langle$y$\rangle$\textbackslash n}, where \texttt{x} and \texttt{y} are
-non-negative integers in the range 0 to 32767. Although limiting, this range is
+non-negative integers in the range 0 to 32767.
+\smallskip
+
+\noindent
+Although limiting, this range is
a design decision taken in order to avoid unnecessary complexity and keep focus
on the main objective of the project: efficient sorting. Including negative
numbers would, for example, require extra logic for parsing, comparison,
-converting back and forth to ASCII, and allocating memory. Both input and output
-follow this strict format.
+converting back and forth to ASCII, and allocating memory.
+\smallskip
+
+\noindent
+Both input and output follow this strict format.
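+\smallskip
+
+\noindent
+To make the parsing of this format concrete, below is a minimal sketch in C of
+how one line could be decoded. It is purely illustrative: the actual
+implementation is in assembly, and the helper \texttt{parse\_line} is a
+hypothetical name, not a function of our program.
+\begin{verbatim}
+#include <stdint.h>
+#include <stdio.h>
+
+/* Illustrative: decode one "<x>\t<y>\n" line starting at *s and
+ * advance *s past the newline. Assumes well-formed input with
+ * values in the range 0..32767. */
+static void parse_line(const char **s, int64_t *x, int64_t *y)
+{
+    const char *p = *s;
+    *x = 0;
+    while (*p != '\t')              /* digits of x until the tab */
+        *x = *x * 10 + (*p++ - '0');
+    p++;                            /* skip the '\t' */
+    *y = 0;
+    while (*p != '\n')              /* digits of y until the newline */
+        *y = *y * 10 + (*p++ - '0');
+    *s = p + 1;                     /* skip the '\n' */
+}
+
+int main(void)
+{
+    const char *input = "3\t7\n15\t2\n";
+    int64_t x, y;
+    parse_line(&input, &x, &y);     /* reads the first pair */
+    printf("%ld\t%ld\n", x, y);     /* prints "3\t7" (x86_64 Linux) */
+    return 0;
+}
+\end{verbatim}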
\subsection{Selecting algorithms}
The ability to choose which algorithm to use can be quite beneficial if you
know certain aspects of your data. Therefore, a design choice has been to allow
the user to select an algorithm as a command line argument, rather than hard coding
-it. This design decision is great for flexibility, and also supports future exceptions
+it.
+\smallskip
+
+\noindent
+This design decision is great for flexibility, and also supports future extensions
of the program.
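+\smallskip
+
+\noindent
+A sketch of this dispatch in C (illustrative only; the names are hypothetical
+and the real implementation is in assembly) could look as follows:
+\begin{verbatim}
+#include <stdint.h>
+#include <stddef.h>
+#include <string.h>
+
+typedef void (*sort_fn)(int64_t **A, size_t n);
+
+/* The two sorting routines, defined elsewhere. */
+extern void insertion_sort(int64_t **A, size_t n);
+extern void quick_sort(int64_t **A, size_t n);
+
+/* Pick a sorting routine from the first command line argument,
+ * falling back to a default when none (or an unknown name) is
+ * given. Which algorithm is the default is illustrative here. */
+sort_fn select_algorithm(int argc, char **argv)
+{
+    if (argc > 1 && strcmp(argv[1], "insertionsort") == 0)
+        return insertion_sort;
+    if (argc > 1 && strcmp(argv[1], "quicksort") == 0)
+        return quick_sort;
+    return quick_sort;              /* default algorithm */
+}
+\end{verbatim}
+\noindent
+Extending the program then only requires a new function of type
+\texttt{sort\_fn} and one extra comparison.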
\subsection{Why Insertionsort and Quicksort?}
Insertion sort and quick sort have been handpicked to research the trade-offs
between algorithmic simplicity and computational efficiency. Furthermore,
-these algorithms are widely different in nature. Insertion sort is an iterative
-algorithm, depending primarily on memory usage, registers, and program flow,
+these algorithms are widely different in nature.
+\smallskip
+
+\noindent
+Insertion sort is an iterative algorithm, depending primarily on memory usage,
+registers, and program flow,
while quick sort is a recursive algorithm using the divide and conquer approach.
This is interesting because Quicksort is expected to excel on large random
datasets, while insertion sort is more suitable for small or nearly sorted
@@ -193,29 +239,19 @@ we simply swap the memory addresses around.
\section{Implementation}
This four-phase design has been converted into Assembly code using the x86\_64
instruction set. Furthermore, the design has been split into modules, where each
-module is a collection of closely related functions. For example
-\textit{array\_maker.s} contains functions meant to parse data and convert it to
+module is a collection of closely related functions.
+\smallskip
+
+\noindent
+For example, \textit{array\_maker.s} contains functions meant to parse data and convert it to
the array format. All functions in this program follow the x86\_64 calling
-conventions, with a few exceptions where all registers are preserved. Below is a
-detailed explanation of some of the more interesting functions.
+conventions, with a few exceptions where all registers are preserved.
+\smallskip
+
-\subsection{Array maker}
-We implemented a global function \texttt{array\_maker}, that takes the
-\textit{file descriptor} as input, and returns a pointer to the start of the
-array, and the number of elements. With this it then first calls a function
-\texttt{create\_file\_buffer}, that reads the contents of the file and puts it
-in memory. It then passes this address, and the length of the file onto the
-function \texttt{make\_pairwise\_data}. This function removes
-\texttt{\textbackslash t} and \texttt{\textbackslash n}, and puts the integers
-side-by-side, with space for 64-bit integers. This format is nice, because it
-allows us to iterate over the number of coordinates, and select every pair $i$
-by going to position $2i$, and using that as the place in memory where the
-$i^\text{th}$ pair in the array points to. The psudocode for this can be seen on
-\autoref{fig:arraymaker}.\footnote{The assembly code is quite easy to interpret
-from this psudocode, hence the psudocode instead of assembly. This also makes it
-easier to understand.}
+\noindent
+Below is a detailed explanation of some of the more interesting functions.
-\begin{wrapfigure}[18]{r}{0.5\textwidth}
+\begin{wrapfigure}{r}{0.6\textwidth}
\textbf{Input\ \ :} File descriptor $f$\\
\textbf{Output:} Pointer and size of array $A$\\
\\
@@ -236,6 +272,33 @@ easier to understand.}
\textbf{return} $A$ and $n$
\caption{Pseudocode for \texttt{array\_maker}}\label{fig:arraymaker}
\end{wrapfigure}
+\subsection{Array maker}
+We implemented a global function \texttt{array\_maker}, which takes the
+\textit{file descriptor} as input, and returns a pointer to the start of the
+array together with the number of elements.
+\smallskip
+
+\noindent
+With this file descriptor it first calls a function,
+\texttt{create\_file\_buffer}, which reads the contents of the file and puts
+them in memory. It then passes this address and the length of the file on to
+the function \texttt{make\_pairwise\_data}.
+\smallskip
+
+\noindent
+This function removes
+\texttt{\textbackslash t} and \texttt{\textbackslash n}, and puts the integers
+side-by-side, with space for 64-bit integers. This format is nice because it
+allows us to iterate over the coordinates and select every pair $i$
+by going to position $2i$, using that as the place in memory that the
+$i^\text{th}$ entry of the array points to.
+\smallskip
+
+\noindent
+The pseudocode for this can be seen in
+\autoref{fig:arraymaker}.\footnote{The assembly code is quite easy to interpret
+from this pseudocode, hence the pseudocode instead of assembly. This also makes it
+easier to understand.}
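+\smallskip
+
+\noindent
+In C terms the resulting layout can be sketched as follows (illustrative only;
+\texttt{build\_pointer\_array} is a hypothetical name):
+\begin{verbatim}
+#include <stdint.h>
+#include <stdlib.h>
+
+/* pairs holds the coordinates side-by-side as 64-bit integers:
+ * x_0, y_0, x_1, y_1, ...  A[i] points to pair i, i.e. to
+ * position 2*i in this flat buffer. Error handling omitted. */
+int64_t **build_pointer_array(int64_t *pairs, size_t n)
+{
+    int64_t **A = malloc(n * sizeof *A);
+    for (size_t i = 0; i < n; i++)
+        A[i] = &pairs[2 * i];       /* A[i][0] = x_i, A[i][1] = y_i */
+    return A;
+}
+\end{verbatim}
+\noindent
+Sorting then only has to swap the 8-byte pointers in $A$, never the 16-byte
+pairs themselves.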
\newpage
@@ -362,7 +425,9 @@ To evaluate whether our program works, we created three \textit{shell scripts},
one for generating test data (\textit{generate\_test\_data.sh}), one for testing
the validity of the output of the program (\textit{test.sh}) and one for testing
the execution time of the program (\textit{benchmark.sh}).
+\smallskip
+
+\noindent
The script that generates test files generates files of size $n$ for every
$n \in \{1000k \mid 0\leq k\leq 100\}$. The size is equivalent to the number of
coordinates. For each $n$ we also create three different kinds of test files.
@@ -370,12 +435,18 @@ One where all of the data is random, one where the data is already sorted, and
one where it is reversely sorted.\footnote{We also ran it for $n=$ 1 million
with random data, but decided to omit these values so the plot would not be too
sparse. Quicksort ran for 0.159s and Insertionsort ran for 386.935s.
-This makes Quicksort roughly 2400 times faster than Insertionsort.} This way we
+This makes Quicksort roughly 2400 times faster than Insertionsort.}
+\smallskip
+
+\noindent
+This way we
can see both the average case and the worst case of any algorithm we decide to use.
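+\smallskip
+
+\noindent
+As an illustration of what the generator produces (the real generator is the
+shell script \textit{generate\_test\_data.sh}; this C sketch and its names are
+hypothetical), the three kinds of files could be created like this:
+\begin{verbatim}
+#include <stdio.h>
+#include <stdlib.h>
+
+/* Keep values inside the allowed range 0..32767. */
+static int clamp(int v) { return v > 32767 ? 32767 : v; }
+
+/* Write n coordinates in the "<x>\t<y>\n" format. kind selects
+ * random ('r'), sorted ('s') or reversely sorted y-values; we
+ * sort by the second coordinate. */
+void write_test_file(FILE *f, int n, char kind)
+{
+    for (int i = 0; i < n; i++) {
+        int x = rand() % 32768;
+        int y = kind == 'r' ? rand() % 32768   /* random           */
+              : kind == 's' ? clamp(i)         /* sorted ascending */
+              : clamp(n - 1 - i);              /* reversely sorted */
+        fprintf(f, "%d\t%d\n", x, y);
+    }
+}
+\end{verbatim}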
Now we can use our \textit{test.sh} script, which runs the program with each of
the generated test files for all sorting algorithms of the program, and compares
-the output with the output from the \textit{builtin} function \textit{sort}.
+the output with that of the \textit{GNU Core Utilities} program \textit{sort}.
+\smallskip
+
+\noindent
Given that all of the tests pass, we can then run \textit{benchmark.sh}, which times
how long it takes for our program to run with each test file for each algorithm,
and saves the result in a \textit{csv} file. The results of this \textit{csv}
@@ -432,14 +503,18 @@ n)$ and $\mathcal O(n^2)$, is shown on \autoref{fig:benchmarktime}.
random, sorted, and reverse sorted data, plotted with a logarithmic y
axis}\label{fig:benchmarktime}
\end{figure}
+\smallskip
+
+\noindent
From \autoref{fig:benchmarktime} we see that Insertionsort behaves as expected
with a best-case runtime of $\mathcal O(n)$ when the input is sorted, and a
worst-case runtime of $\mathcal O(n^2)$ when the input is reversely sorted or
random. We also see that Quicksort behaves as expected with a best-case runtime
of $\mathcal O(n\log n)$ when the input is random, and a worst-case runtime of
$\mathcal O(n^2)$ when the input is either sorted or reversely sorted.
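+\smallskip
+
+\noindent
+The $\mathcal O(n^2)$ behaviour of Quicksort on sorted and reversely sorted
+input is consistent with a fixed-position pivot: every partition is then
+maximally unbalanced, giving the recurrence
+\[
+T(n) = T(n-1) + \Theta(n)
+\quad\Longrightarrow\quad
+T(n) = \Theta\Bigl(\sum_{k=1}^{n} k\Bigr) = \Theta(n^2).
+\]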
+\smallskip
+
+\noindent
It is, however, also possible to count the number of comparisons for each
algorithm. This plot can be seen in \autoref{fig:benchmarkcomp}. Here we clearly
see that, for example, Insertionsort has exactly $n$ comparisons when sorting
@@ -501,35 +576,66 @@ elements, matches - within a scalar - with $n\log n$.
\subsection{Nonalgorithmic factors}
Although the asymptotic run times are caused by the algorithms themselves,
the total run times are also greatly affected by the concrete implementations
-of highly repeated functions. A good example is our old generateOutput function,
-which was extremely slow. This function has since been removed and exchanged with
+of highly repeated functions.
+\smallskip
+
+\noindent
+A good example is our old \texttt{generateOutput} function, which was extremely slow.
+This function has since been removed and replaced by
a new approach using a buffer. The old implementation looped over the sorted list and
-used 4 write syscalls per coordinate: One for printing x-coordinate, one for
-printing y, one for tab, and one for a newline character. Thus 4 million syscalls
-would be needed just to print a sorted list of 1 million coordinates, which causes
-an enormous program overhead. So the point is that even though the problem studied
+used 4 write syscalls per coordinate: one for printing the x-coordinate, one
+for printing the y-coordinate, one for the tab, and one for the newline
+character. Thus 4 million syscalls would be needed just to print a sorted
+list of 1 million coordinates, which causes an enormous amount of syscall
+overhead.
+\smallskip
+
+\noindent
+The point is that even though the problem studied
is efficient sorting, correct usage of registers, memory, and syscalls still
greatly affects performance.
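+\smallskip
+
+\noindent
+The difference is easy to sketch in C (illustrative; \texttt{to\_ascii} and the
+buffer size are hypothetical): the old approach paid four \texttt{write}
+syscalls per pair, while the buffered approach pays one \texttt{write} per many
+thousands of pairs.
+\begin{verbatim}
+#include <unistd.h>
+#include <stddef.h>
+#include <stdint.h>
+
+/* Hypothetical helper: render v as decimal ASCII, return length. */
+static int to_ascii(uint64_t v, char *buf)
+{
+    char tmp[20];
+    int n = 0;
+    do { tmp[n++] = '0' + v % 10; v /= 10; } while (v);
+    for (int i = 0; i < n; i++) buf[i] = tmp[n - 1 - i];
+    return n;
+}
+
+/* Old approach: four write syscalls per coordinate pair. */
+void print_pair_slow(uint64_t x, uint64_t y)
+{
+    char b[20]; int n;
+    n = to_ascii(x, b); write(1, b, n);
+    write(1, "\t", 1);
+    n = to_ascii(y, b); write(1, b, n);
+    write(1, "\n", 1);
+}
+
+/* New approach: append to a large buffer, flush it rarely.
+ * A final flush of the remainder is still needed at exit. */
+static char out[1 << 16];
+static size_t used;
+
+void print_pair_buffered(uint64_t x, uint64_t y)
+{
+    used += to_ascii(x, out + used);
+    out[used++] = '\t';
+    used += to_ascii(y, out + used);
+    out[used++] = '\n';
+    if (used > sizeof out - 64) {   /* flush when nearly full */
+        write(1, out, used);
+        used = 0;
+    }
+}
+\end{verbatim}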
\section{Conclusion}
In this project insertionsort and quicksort have been implemented in x86\_64 assembly
with the goal of studying efficient sorting. Although the algorithms perform the same task,
- they are in fact very different. Firstly, from the benchmarks we have learned that the
+they are in fact very different.
+\smallskip
+
+\noindent
+Firstly, from the benchmarks we have learned that the
algorithms are heavily affected by the dataset itself. Both algorithms have very
different run times on the sorted, reversed, and random datasets, despite the sets
-having the same size. As discussed, earlier insertionsort and quicksort have roughly
+having the same size.
+\smallskip
+
+\noindent
+As discussed earlier, insertionsort and quicksort have roughly
the same performance on reversed datasets, insertionsort is optimal on sorted or
small datasets, and quicksort excels on random and very large datasets. So one
could say that efficient sorting requires an efficient algorithm well suited for
the dataset. But it should also be mentioned that in most general cases with
-random output quicksort will most likely outperform insertionsort. Furthermore,
-from the testings we have learned that the stable tests used on insertion sort
+random input, quicksort will most likely outperform insertionsort.
+\smallskip
+
+\noindent
+Furthermore, from the testing we have learned that the stability tests used on insertion sort
cannot be used on quicksort, since quicksort does not preserve the order of the inputs.
This means that the algorithms might produce different results even if used
-on the same dataset. Another important observation has been that although the algorithms
+on the same dataset.
+\smallskip
+
+\noindent
+Another important observation has been that although the algorithms
themselves determine the asymptotic run times, the concrete implementations also matter
for the total run times. This means that memory usage, registers, stack management, and
-syscalls should also be considered while trying to create a fast sorting algorithm. Summing
-all these observations up in one sentence, we can conclude that a recipe to efficient sorting
+syscalls should also be considered while trying to create a fast sorting algorithm.
+\smallskip
+
+\noindent
+Summing all these observations up in one sentence,
+we can conclude that a recipe for efficient sorting
is choosing a well-suited algorithm supported by a smart implementation.
\end{document}