aboutsummaryrefslogtreecommitdiff
path: root/report
diff options
context:
space:
mode:
authorAndreas Kapp Lindquist <andkaplin05@gmail.com>2025-10-27 10:38:01 +0100
committermithe24 <mithe24@student.sdu.dk>2025-10-29 13:49:57 +0100
commit6e82c40715c3dacec01f2ecc8cf15ac4c2907f66 (patch)
tree97a54963ba59da772e714dc71527c3667cc97a3f /report
parentc4d95bde5fcda18b19766fe8ec2912c1ebede4e4 (diff)
downloadsorter-6e82c40715c3dacec01f2ecc8cf15ac4c2907f66.tar.gz
sorter-6e82c40715c3dacec01f2ecc8cf15ac4c2907f66.zip
report(*): Made lines fit 80 characters, and word correction
Diffstat (limited to '')
-rw-r--r--report/report.tex134
1 files changed, 77 insertions, 57 deletions
diff --git a/report/report.tex b/report/report.tex
index ea6822b..e08cc7f 100644
--- a/report/report.tex
+++ b/report/report.tex
@@ -15,7 +15,7 @@
\begin{center}
\Huge
- Sorting in x86\_64 linux Assembly
+ Sorting in x86\_64 Linux Assembly
\vspace{1cm}
\Large
@@ -37,34 +37,45 @@
\newpage
\section{Introduction}
-Sorting is one of the most fundamental operations in programming. Sorting algorithms play a crucial role
-in efficient searching, organization and interpretation of data, and optimization problems. Furthermore,
-sorting is often times the first step towards designing complicated algorithms in computer science. This
-project aims to study efficient sorting of a list of coordinates by the second value of each coordinate,
-in Assembly. Although abstract high level languages such as Python and Java allow faster and convenient development,
-Assembly provides precise control over program flow, memory usage and instruction level execution, making Assembly an excellent
-evironment for benchmarking and performance analysis. In this project the two algorithms quick-sort and insertion-sort
-will be implemented and compared against each other. Quick-sort is expected to excel on large random datasets,
-while insertion-sort is more suitable for small or nearly sorted cases.
-Several test
-instances consisting of uniformly random coordinates will be created with the goal of comparing the implementations
-actual runtime to their theoretical runtimes and evaluate each implementation.
+Sorting is one of the most fundamental operations in programming. Sorting
+algorithms play a crucial role in efficient searching, organization and
+interpretation of data, and optimization problems. Furthermore, sorting is often
+times the first step towards designing complicated algorithms in computer
+science. This project aims to study efficient sorting of a list of coordinates
+by the second value of each coordinate, in Assembly. Although abstract high
+level languages such as Python and Java allow faster and convenient development,
+Assembly provides precise control over program flow, memory usage and
+instruction level execution, making Assembly an excellent environment for
+benchmarking and performance analysis. In this project the two algorithms
+quick-sort and insertion-sort will be implemented and compared against each
+other. Quick-sort is expected to excel on large random datasets, while
+insertion-sort is more suitable for small or nearly sorted cases. Several test
+instances consisting of uniformly random coordinates will be created with the
+goal of comparing the implementations actual runtime to their theoretical
+runtimes and evaluate each implementation.
\section{Design}
\subsection{How are the input file and output formatted?}
-Each line contains one and only one coordinate consisting of two values formatted as \texttt{$\langle$x$\rangle$\textbackslash t$\langle$y$\rangle$\textbackslash n},
-where x and y are non-negative integers in the range 0 to 32767. Although limitting, this range is a design desicion taken in order to avoid unnecessary complexity and
-keep focus on the main objective of the project, efficient sorting. Including negative numbers would for example require extra logic for parsing, comparision, converting back and forth to ASCII,
-and allocating memory. Both input and output follow this strict format.
+Each line contains one and only one coordinate consisting of two values
+formatted as \texttt{$\langle$x$\rangle$\textbackslash
+t$\langle$y$\rangle$\textbackslash n}, where $x$ and $y$ are non-negative
+integers in the range 0 to 32767. Although limiting, this range is a design
+decision taken in order to avoid unnecessary complexity and keep focus on the
+main objective of the project, efficient sorting. Including negative numbers
+would for example require extra logic for parsing, comparison, converting back
+and forth to ASCII, and allocating memory. Both input and output follow this
+strict format.
\subsection{Program Flow}
-The program is structured into four main phases: reading input, parsing data, sorting, and generating output. Each phase is
-designed to handle a specific task, which creates modulariy and clarity. Whitin each phase a set of functions manage the corresponding
-operations, allowing the programmer to focus on one task at a time. This design also greatly supports debugging
-by simplifying tracking and solving errors. Below is a flowchart visualizing each step, followed by
-a short summary of the program.
+The program is structured into four main phases: reading input, parsing data,
+sorting, and generating output. Each phase is designed to handle a specific
+task, which creates modularity and clarity. Within each phase a set of functions
+manage the corresponding operations, allowing the programmer to focus on one
+task at a time. This design also greatly supports debugging by simplifying
+tracking and solving errors. Below is a flowchart visualizing each step,
+followed by a short summary of the program.
\begin{figure}[H]
\centering
@@ -73,37 +84,43 @@ a short summary of the program.
\label{fig:flowchart}
\end{figure}
-\textbf{Reading Input:} The first step of the program is to find the input file, use an open syscall, calculate the size of the file using a function
-called \textit{getFileSize}, allocating memory, and saving the coordinates.
+\textbf{Reading Input:} The first step of the program is to find the input file,
+use an open syscall, calculate the size of the file using a function called
+\textit{getFileSize}, allocating memory, and saving the coordinates.
-\textbf{Parsing the data:} Next, the received file is to be parsed, meaning that only useful data will be extracted. This process omits
-the tabs, and newlines, which in turn significantly simplifies iteration through the data, by ensuring that only 8-byte coordinates are left.
-Lastly an important design choice has been creating an array of pointers to each coordinatepair.
-This design is explained in further detail later. All of this logic has been wrapped in a function \textit{make\_array\_from\_file}, which
-utilizes several smaller helper functions to perform this parsing process.
+\textbf{Parsing the data:} Next, the received file is to be parsed, meaning that
+only useful data will be extracted. This process omits the tabs, and newlines,
+which in turn significantly simplifies iteration through the data, by ensuring
+that only 8-byte coordinates are left. Lastly an important design choice has
+been creating an array of pointers to each coordinate pair. This design is
+explained in further detail later. All of this logic has been wrapped in a
+function \textit{make\_array\_from\_file}, which utilizes several smaller helper
+functions to perform this parsing process.
-\textbf{Sorting:}
-After parsing, a sorting algorithm of choice can be used on the coordinates. The currently implemented sorting algorithms are
-quick-sort and insertion-sort. Both algorithms have been designed to allow sorting both on the x-coordinate and the y-coordinate. This
-is easily achievable because of the array of pointers. The chosen algorithm can be passed as a command line argument, otherwise
-a default algorithm will be run. This design where an algorithm can be passed to the sorting function, allows the programmed to be
+\textbf{Sorting:} After parsing, a sorting algorithm of choice can be used on
+the coordinates. The currently implemented sorting algorithms are quick-sort and
+insertion-sort. Both algorithms have been designed to allow sorting both on the
+x-coordinate and the y-coordinate. This is easily achievable because of the
+array of pointers. The chosen algorithm can be passed as a command line
+argument, otherwise a default algorithm will be run. This design where an
+algorithm can be passed to the sorting function, allows the programmed to be
extended with new algorithms if needed.
-\textbf{Generating output:} A side effect of the \textit{make\_array\_from\_file} is converting ASCII characters
-to 8-byte integers. This allows the sorting algorithm to make numeric comparisons on the data. But this also means that the data should
-be converted back to ASCII, before printing the output.
+\textbf{Generating output:} A side effect of the
+\textit{make\_array\_from\_file} is converting ASCII characters to 8-byte
+integers. This allows the sorting algorithm to make numeric comparisons on the
+data. But this also means that the data should be converted back to ASCII,
+before printing the output.
\subsection{Why an array of pointers?}
-In order to be a litte more general, and make our program easily expandable to
+In order to be a little more general, and make our program easily expandable to
different types of data, we wanted to create out own array datatype, that
-allowes us to sort something more abstract, with respect to a key in that
-abstract object.
-To do this we settled on an array of memory addresses, that point to where the
-abstract objects are, see \autoref{fig:array}. These object would in practice
-also be represented as
-arrays. With this we can specify which index (key) in the sub-arrays we want to
-sort by, and then when we swap elements in the array, we simply swap the memory
-addresses around.
+allows us to sort something more abstract, with respect to a key in that
+abstract object. To do this we settled on an array of memory addresses, that
+point to where the abstract objects are, see \autoref{fig:array}. These object
+would in practice also be represented as arrays. With this we can specify which
+index (key) in the sub-arrays we want to sort by, and then when we swap elements
+in the array, we simply swap the memory addresses around.
\begin{figure}[H]
\centering
@@ -146,22 +163,25 @@ addresses around.
\end{figure}
\section{Implementation}
-This four phase design has been converted into Assembly code using the x86 64 instruction set.
-Furthermore the design has been split up in modules, where each module is a collection of closely
-related functions. For example \textit{Array\_maker.s} contains functions meant to parse data.
-All functions in this program follow the x86 calling conventions, with a few exceptions
-where all registers are preserved. Below is a detailed explanation of some of the more
+This four phase design has been converted into Assembly code using the x86\_64
+instruction set. Furthermore the design has been split up in modules, where each
+module is a collection of closely related functions. For example
+\textit{array\_maker.s} contains functions meant to parse data. All functions in
+this program follow the x86 calling conventions, with a few exceptions where all
+registers are preserved. Below is a detailed explanation of some of the more
interesting functions.
\subsection{Insertion-sort}
-This is a classic implementation of the iterative insertion-sort. The function takes three arguments:
-address of the array of pointers to each coordinate, number of coordinates and which coordinate dimension to sort by.
-The first step of this algorithm is to check the number of coordinates with a compare statement, if n $\le$ 1,
-the array is already sorted and nothing should be done. Otherwise the algorithm initializes two indices i and j,
-representing the outer and inner loop in insertion-sort. A snippet of the algorithm is seen below.
+This is a classic implementation of the iterative insertion-sort. The function
+takes three arguments: address of the array of pointers to each coordinate,
+number of coordinates and which coordinate dimension to sort by. The first step
+of this algorithm is to check the number of coordinates with a compare
+statement, if n $\le$ 1, the array is already sorted and nothing should be done.
+Otherwise the algorithm initializes two indices i and j, representing the outer
+and inner loop in insertion-sort. A snippet of the algorithm is seen below.
\begin{algorithm}[H]
-\caption{Isertion-sort inner Loop}
+\caption{Insertion-sort inner Loop}
\SetAlgoNoLine
\SetKw{KwCmp}{cmpq}
\SetKw{KwJl}{jl}