From 5909443dc9b16aedb81f2d09f33d1d8efa48bf33 Mon Sep 17 00:00:00 2001 From: Andreas Kapp Lindquist Date: Thu, 30 Oct 2025 09:06:20 +0100 Subject: report: Run through and fixes --- report/report.tex | 220 +++++++++++++++++++++++++----------------------------- 1 file changed, 101 insertions(+), 119 deletions(-) (limited to 'report/report.tex') diff --git a/report/report.tex b/report/report.tex index 277acaa..7cfc7a6 100644 --- a/report/report.tex +++ b/report/report.tex @@ -81,16 +81,17 @@ theoretical runtimes and evaluate each implementation. The program is structured into four main phases: \noindent -\textbf{selecting} an algorithm, \textbf{converting} the inputfile to the array format, -\textbf{sorting} that array, and \textbf{printing} the array to \textit{stdout}. +\textbf{selecting} an algorithm, \textbf{converting} the inputfile to the array +format, \textbf{sorting} that array, and \textbf{printing} the array to +\textit{stdout}. \smallskip \noindent -Each phase is designed to handle a specific task, which -creates modularity and clarity. Within each \textit{phase} a set of functions manage the +Each \textit{phase} is designed to handle a specific task, which creates +modularity and clarity. Within each \textit{phase} a set of functions manage the corresponding operations, allowing the programmer to focus on one task at a -time. This design also greatly supports \textit{debugging} by simplifying tracking and -solving errors. +time. This design also greatly supports \textit{debugging} by simplifying +tracking and solving errors. \smallskip \noindent @@ -105,14 +106,14 @@ The flowchart below visualize each step. \noindent \textbf{Parsing the data:} After reading the input, the received file is to be -\textit{parsed}, meaning that only useful data will be extracted. This will in turn -significantly simplifies \textit{iteration} through the data, by ensuring that only -meaningful are left. +\textit{parsed}, meaning that only useful data will be extracted. This will in +turn significantly simplify \textit{iteration} through the data, by ensuring +that only meaningful are left. \smallskip \noindent -Lastly an important design choice has been creating an -array of \textit{pointers} to each coordinate pair. This design is explained in further +Lastly an important design choice has been creating an array of +\textit{pointers} to each coordinate pair. This design is explained in further detail later. \medskip @@ -124,7 +125,7 @@ passed as a \textit{command line argument}, otherwise a default algorithm will b \noindent This design where an algorithm can be passed to the sorting function, allows the -programmed to be extended with new algorithms if needed. +program to be extended with new algorithms if needed. \smallskip \noindent @@ -141,7 +142,7 @@ non-negative integers in the range 0 to 32767. \noindent Although limiting, this range is a design decision taken in order to avoid unnecessary complexity and keep focus -on the main objective of the project, efficient sorting. Including negative +on the main objective of the project; efficient sorting. Including negative numbers would for example require extra logic for parsing, comparison, converting back and forth to ASCII, and allocating memory. \smallskip @@ -158,22 +159,19 @@ This design decision is great for flexibility, and also supports future exceptio of the program. \subsection{Why Insertionsort and Quicksort?} -Insertionsort and Quicksort have been handpicked to research the trade-offs +Insertionsort and quicksort have been handpicked to research the trade-offs between algorithmic simplicity and computational efficiency. Furthermore, these algorithms are widely different in nature. \smallskip \noindent Insertionsort is an iterative algorithm, depending primarily on memory usage, -registers, and program flow, -\smallskip - -\noindent -while Quicksort is a recursive algorithm using the divide and conquer approach. -This is interesting, because Quicksort is expected to excel on large random -datasets, while Insertionsort is more suitable for small or nearly sorted -cases, which is another great motivation to understand what is really -happening when these algorithms are executed, and how the dataset affects them. +registers, and program flow, while quicksort is a recursive algorithm using the +\textit{divide and conquer} approach. This is interesting, because quicksort is +expected to excel on large random datasets, while insertionsort is more suitable +for small or nearly sorted cases, which is another great motivation to +understand what is really happening when these algorithms are executed, and how +the dataset affects them. \begin{wrapfigure}[15]{r}{0.5\textwidth} \centering @@ -216,32 +214,31 @@ happening when these algorithms are executed, and how the dataset affects them. \end{wrapfigure} \subsection{Sorting Format} In order to be a little more general, and make our program easily expandable to -different types of data, we wanted to create our own array datatype, that -allows us to sort something more abstract, with respect to a key in that -abstract object. +different types of data, we wanted to create our own array datatype, that allows +us to sort something more abstract, with respect to a key in that abstract +object. \smallskip \noindent -To do this we settled on an array of memory addresses, that -point to where the abstract objects are, see \autoref{fig:array}. +To do this we settled on an array of memory addresses, that point to where the +abstract objects are, see \autoref{fig:array}. \smallskip \noindent -These objects would in practice also be represented as arrays. -With this we can specify which index (key) in the sub-arrays we want to sort by, -and then when we swap elements in the array, -we simply swap the memory addresses around. +These objects would in practice also be represented as arrays. With this we can +specify which index (key) in the sub-arrays we want to sort by, and then when we +swap elements in the array, we simply swap the memory addresses around. \section{Implementation} -This four phase design has been converted into Assembly code using the x86\_64 -instruction set. Furthermore the design has been split up in modules, where each -module is a collection of closely related functions. +This four phase design has been converted into \textit{Assembly} code using the +\textit{x86\_64} instruction set. Furthermore the design has been split up in +modules, where each module is a collection of closely related functions. \smallskip \noindent -For example \textit{array\_maker.s} contains functions meant to parse data and convert it to -the array format. All functions in this program follow the x86 calling -conventions, with a few exceptions where all registers are preserved. +For example \texttt{array\_maker.s} contains functions meant to parse data and +convert it to the array format. All functions in this program follow the +\textit{x86 calling conventions}, which adding more features a lot easier. \smallskip \noindent @@ -278,7 +275,7 @@ array, and the number of elements. \smallskip \noindent -With this it then first calls a function +The first thing the function does is call another function called \texttt{create\_file\_buffer}, that reads the contents of the file and puts it in memory. It then passes this address, and the length of the file onto the function \texttt{make\_pairwise\_data}. @@ -294,10 +291,9 @@ $i^\text{th}$ pair in the array points to. \smallskip \noindent -The psudocode for this can be seen on -\autoref{fig:arraymaker}.\footnote{The assembly code is quite easy to interpret -from this psudocode, hence the psudocode instead of assembly. This also makes it -easier to understand.} +The psudocode for this can be seen on \autoref{fig:arraymaker}.\footnote{The +assembly code is quite easy to recreate from this psudocode, hence the psudocode +instead of assembly. This also makes it easier to convey its functionality.} \newpage @@ -306,16 +302,16 @@ easier to understand.} \caption{Pseudo code for quick-sort algorithm in c-style}\label{fig:qsort} \end{wrapfigure} \subsection{Quicksort} -The quicksort implementation uses \textbf{Hoare partition scheme}, -a \textit{two-way partitioning} approach two pointers scan from opposite ends -of the array toward the middle. +The quicksort implementation uses \textbf{Hoare partition scheme}, which is a +\textit{two-way partitioning} approach, where two pointers scan from opposite +ends of the array toward the middle. \smallskip \noindent -The left pointer advances rightward while -pointing at elements smaller than the \textit{pivot}, and the right pointer -advances leftward while pointing at elements larger than the \textit{pivot}. -When both pointers points at misplaced elements, they swap them. +The left pointer advances rightward while pointing at elements smaller than the +\textit{pivot}, and the right pointer advances leftward while pointing at +elements larger than the \textit{pivot}. When both pointers point to misplaced +elements, they swap them. \smallskip \noindent @@ -353,22 +349,22 @@ in the pointer-array. \caption{Table over numeric ASCII codes}\label{fig:ascii-table} \end{wrapfigure} -When converting each coordinate back into \textit{ASCII}, -we use the \textbf{Euclidean algorithm} to extract each digit, -but the result is still in binary. +When converting each coordinate back into \textit{ASCII}, we use the +\textbf{Euclidean algorithm} to extract each digit, but the result is still in +binary. \smallskip \noindent -To convert the binary digit to its \textit{ASCII} character, -we simply add the \textit{ASCII} code for \texttt{0}, -which is \texttt{48} in decimal and \texttt{30} in hexadecimal. -This works because each digit is in order in the \textit{ASCII Table}. +To convert the binary digit to its \textit{ASCII} character, we simply add the +\textit{ASCII} code for \texttt{0}, which is \texttt{48} in decimal and +\texttt{30} in hexadecimal. This works because each digit is in order, in the +\textit{ASCII Table}. \smallskip \noindent -But the algorithm has the downside of getting each digit -in \textit{reverse} order, means we need to start right side of the \textit{buffer} when -adding new digits, as to not \textit{overwrite} previous digits. +The algorithm downside is getting each digit in \textit{reverse} order, which +means we need to start right side of the \textit{buffer} when adding new digits, +as to not \textit{overwrite} previous digits. \smallskip \noindent @@ -377,8 +373,8 @@ So each digit needs to be shifted down to \textit{leftmost} position in memory. \noindent To do this we used \texttt{REP MOVSB} which is a string operation in the -\textit{x86 instructionset architecture} that performs a -\textit{repeated byte copy operation}. +\textit{x86 instructionset architecture} that performs a \textit{repeated byte +copy operation}. \noindent \begin{wrapfigure}[8]{l}{0.55\textwidth} @@ -398,28 +394,25 @@ To do this we used \texttt{REP MOVSB} which is a string operation in the \end{wrapfigure} \noindent -It copies data from the memory location pointed to by -the \textit{source} index register (\texttt{RSI}) to the location pointed to by -the \textit{destination} index register (\texttt{RDI}), -repeating this operation for a number of \textit{iterations} specified by -the count register (\texttt{RCX}). +It copies data from the memory location pointed to by the \textit{source} index +register (\texttt{RSI}) to the location pointed to by the \textit{destination} +index register (\texttt{RDI}), repeating this operation for a number of +\textit{iterations} specified by the count register (\texttt{RCX}). \vspace{1cm} \noindent \texttt{RSI} and \texttt{RDI} are automatically either \textit{incremented} or -\textit{decremented}, based on direction flag (\texttt{DF}); -when the direction flag is \textit{clear}, both pointers \textit{increment}, -and when it's \textit{set} both pointers \textit{decrement}. +\textit{decremented}, based on direction flag (\texttt{DF}); when the direction +flag is \textit{clear}, both pointers \textit{increment}, and when it's +\textit{set} both pointers \textit{decrement}. \smallskip \noindent -This \textit{instruction} is particularly useful for bulk memory operations -such as copying strings -as it encodes a potentially lengthy loop into a \textit{single} instruction, -thereby improving code density. +This \textit{instruction} is particularly useful for bulk memory operations such +as copying strings, as it encodes a potentially lengthy loop into a +\textit{single} instruction, thereby improving code density. \section{Evaluation} - To evaluate whether our program works, we created three \textit{shell scripts}, one for generating test data (\textit{generate\_test\_data.sh}), one for testing the validity of the output of the program (\textit{test.sh}) and one for testing @@ -427,7 +420,7 @@ the execution time of the program (\textit{benchmark.sh}). \smallskip \noindent -The script that generates test files, generates files of size $n$ for all +The script that generates test files, generates files of with the sizes $\{1000n\mid 0\leq n\leq 100\}$. The size is equivalent to the number of coordinates. For each $n$ we also create three different kinds of test files. One where all of the data is random, one where the data is already sorted, and @@ -438,15 +431,14 @@ This makes Quicksort roughly 2400 times faster than Insertionsort.} \smallskip \noindent -This way we -can see both the average case, and worst case of any algorithm we decide to use. -Now we can use our \textit{test.sh} script, which runs the program with each of -the generated test files for all sorting algorithms of the program, and compares -the output with the output from the \textit{GNU Core Utilities} program \textit{sort}. -\smallskip +This way we can see both the average case, and worst case of any algorithm we +decide to use. Now we can use our \textit{test.sh} script, which runs the +program with each of the generated test files for all sorting algorithms of the +program, and compares the output with the output from the \textit{GNU Core +Utilities} program \textit{sort}. -\noindent -If all of the tests pass, we can then run \textit{benchmark.sh}, which times +\subsection{Benchmarking} +Given all of the tests pass, we can then run \textit{benchmark.sh}, which times how long it takes for our program to run with each test file for each algorithm, and saves the result in a \textit{csv} file. The results of this \textit{csv} file, together with functions asymptotically equivalent to $\mathcal O(n)$, $\mathcal O(n\log @@ -498,7 +490,7 @@ n)$ and $\mathcal O(n^2)$, is shown on \autoref{fig:benchmarktime}. \end{axis} \end{tikzpicture} - \caption{Benchmarking data for timing from Insertionsort and Quicksort for + \caption{Benchmarking data for timing from insertionsort and quicksort for both random, sorted and reverse sorted data, plotted with a logarithmic y axis}\label{fig:benchmarktime} \end{figure} @@ -566,7 +558,7 @@ elements, matches - within a scalar - with $n\log n$. \end{axis} \end{tikzpicture} - \caption{Benchmarking data for comparisons from Insertionsort and Quicksort + \caption{Benchmarking data for comparisons from insertionsort and quicksort for both random, sorted and reverse sorted data. The y axis is logarithmic, and the reference lines are not asymptotic but exact.}\label{fig:benchmarkcomp} @@ -579,35 +571,31 @@ of highly repeated functions. \smallskip \noindent -A good example is our old generateOutput function, which was extremely slow. -This function has since been removed and exchanged with -a new approach using a buffer. The implementation looped over the sorted list and -used 4 write syscalls per coordinate: +A good example is our old \texttt{generateOutput} function, which was extremely +slow. This function has since been removed and exchanged with a new approach +using a buffer. The implementation looped over the sorted list and used 4 write +\textit{syscalls} per coordinate: one for printing the $x$-coordinate, one for +printing the $y$-coordinate, one for tab, and one for a newline character. Thus +4 million syscalls would be needed just to print a sorted list of 1 million +coordinates, which causes an enormous program overhead. \smallskip \noindent -One for printing x-coordinate, one for -printing y, one for tab, and one for a newline character. -Thus 4 million syscalls would be needed just to print a sorted list of 1 million coordinates, which causes -an enormous program overhead. -\smallskip - -\noindent -So the point is that even though the problem studied -is efficient sorting, correct usage of registers, memory and syscalls will greatly -affect and boost performance. +So the point is that even though the problem studied is efficient sorting, +correct usage of registers, memory and syscalls will greatly affect and boost +performance. \section{Conclusion} -In this project insertionsort and quicksort have been implemented in assembly x86 64 -with the goal of studying efficient sorting. Although the algorithms perform the same task, -they are quite very different. +In this project insertionsort and quicksort have been implemented in +\textit{x86_64 assembly} with the goal of studying efficient sorting. Although +the algorithms perform the same task, they are in fact very different. \smallskip \noindent -Firstly, from the benchmarks we have learned that the -algorithms are heavily affected by the dataset itself. Both algorithms have very -different run times on the sorted, reversed, and random datasets, despite the sets -having the same size. +Firstly, from the benchmarks we have learned that the algorithms are heavily +affected by the dataset itself. Both algorithms have very different run times on +the sorted, reversed, and random datasets, despite the sets having the same +size. \smallskip \noindent @@ -616,21 +604,15 @@ the same performance on reversed datasets, insertionsort is optimal on sorted or small datasets, and quicksort excels on random and very large datasets. Thus efficient sorting requires an efficient algorithm well-suited for the dataset. But it should also be mentioned that in most general cases with -random output quicksort will most likely outperform insertionsort. -\smallskip - -\noindent -Furthermore, from the testings we have learned that the stable tests used on Insertionsort -cannot be used on quicksort, since quicksort doesn’t preserve the order the inputs. -This means that the algorithms might produce different results even if used -on the same dataset. +random output quicksort will outperform insertionsort. \smallskip \noindent -Another important observation has been that although the algorithms -themselves determine the asymptotic run times, the concrete implementations also matter -for the total run times. Which means that memory usage, registers, stack management, and -syscalls should also be considered while trying to create a fast sorting algorithm. +Another important observation has been that although the algorithms themselves +determine the asymptotic run times, the concrete implementations also matter for +the total run times. Which means that memory usage, registers, stack management, +and syscalls should also be considered while trying to create a fast sorting +algorithm. \smallskip \noindent -- cgit v1.2.3-70-g09d2