Description
Overview
The purpose of this assignment is for you to write a data structure called a Linked List, which utilizes templates (similar to Java’s generics), in order to store any type of data. In addition, the nature of a Linked List will give you some experience dealing with non-contiguous memory organization.
This will also give you more experience using pointers and memory management. Pointers, memory allocation, and an understanding of how data is stored in memory will serve you well in a variety of situations, not just for an assignment like this.
2 Background
2.1 Remember Memory?
Variables, functions, pointers… everything takes up SOME space in memory. Sometimes that memory is occupied for only a short duration (a temporary variable in a function); sometimes that memory is allocated at the start of a program and hangs around for the lifetime of that program. Visualizing memory can be difficult, but it is very helpful.
You may see diagrams of memory like this:

    | x | y | z | car | r | g | b | a | ptr | Other memory |

    int x, y, z;
    Vehicle car;
    char r, g, b, a;
    Vehicle *ptr = &car;

There are other ways to draw this, and if you are trying to draw out some representation of memory to help you solve a problem, a visualization that makes sense to you will be best. This PDF will specifically use this horizontal representation, but the same concepts used here will apply to any other representation!
2.2 Arrays
Arrays are stored in what is called contiguous memory. A contiguous memory block is one that is not interrupted by anything else (i.e. any other memory block). So if you created an array of 5 integers, each of those integers would be located one after the other in memory, with nothing else occupying memory between them.
This is true for all arrays, of any data type.

    int someArray[5];
    someArray[0] = 2;
    someArray[1] = 4;
    someArray[2] = 6;
    someArray[3] = 8;
    someArray[4] = 10;

    | Other memory | 2 | 4 | 6 | 8 | 10 | Other memory |
                   '----- someArray ----'

All of the data in an application is not guaranteed to be contiguous, nor does it need to be. Arrays are typically the simplest and fastest way to store data, but they have a grand total of zero features.
You allocate one contiguous block, but you can’t resize it, removing elements is a pain (and slow), etc. Consider the previous array. What if you wanted to add another element to that block of memory? If the surrounding memory is occupied, you can’t simply overwrite that with your new data element and expect good results. (This may look familiar!)

    someArray[5] = 12; // bad idea: index 5 is past the end of the array

    | Other memory | 2 | 4 | 6 | 8 | 10 | Someone else’s memory |
This will almost certainly break. In this scenario, in order to store one more element you would have to:
1. Create another array that is large enough to store all of the old elements plus the new one
2. Copy over all of the data elements one at a time (including the new element, at the end)
3. Free up the old array (no point in having two copies of the data)

Create the new array:

    | Other memory | 2 | 4 | 6 | 8 | 10 | Someone else’s memory | (uninitialized values) |
                   '---- old array ----'

Copy the old data to the new array, and the “new guy”:

    | Other memory | 2 | 4 | 6 | 8 | 10 | Someone else’s memory | 2 | 4 | … | … | … | 12 |

Lastly, delete the old array:

    | Other memory | Newly available memory | Someone else’s memory | 2 | 4 | 6 | 8 | 10 | 12 |

This process has to be repeated each time you want to add to the array (either at the end, or inserting in the middle), or remove anything from the array. It can be quite costly, in terms of performance, to delete/rebuild an entire array every time you want to make a single change. Cue the Linked List!
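The three resize steps for arrays can be sketched in a few lines of C++. This is only an illustration: the helper name AppendToArray and its parameters are made up for this example, and it assumes the old array was allocated with new[].

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns a new heap array holding the old elements
// followed by newValue, and frees the old array.
int* AppendToArray(int* oldArray, std::size_t oldCount, int newValue) {
    assert(oldArray != nullptr || oldCount == 0);

    // 1. Create another array large enough for the old elements plus one.
    int* newArray = new int[oldCount + 1];

    // 2. Copy the elements one at a time, then the new element at the end.
    for (std::size_t i = 0; i < oldCount; ++i)
        newArray[i] = oldArray[i];
    newArray[oldCount] = newValue;

    // 3. Free the old array (deleting a null pointer is a harmless no-op).
    delete[] oldArray;
    return newArray;
}
```

Notice that every existing element gets touched just to add one value. The caller also has to remember that the old pointer is now invalid and that the returned array must eventually be released with delete[].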
2.3 Linked List
The basic concept behind a Linked List is simple:
1. It’s a container that stores its elements in a non-contiguous fashion
2. Each element knows about the location of the element which comes after it (and possibly before, more on that later)

So instead of a contiguous array, where element 4 comes after element 3, which comes after element 2, etc., you might have something like this:

    | … | First | … | Fourth | Second | … | Third | … |

    (links: First -> Second -> Third -> Fourth -> nullptr)

Each element in the Linked List (typically referred to as a “node”) stores some data, plus some sort of reference (a pointer, in C++) to whatever node should come next.
The First node knows only about itself, and the Second node. The Second node knows only about itself, and the Third, etc. In this example the Fourth node has a null pointer as its “next” node, indicating that we’ve reached the end of the data. A real-world example can be helpful as well: Think about a line of people, with one person at the front of the line.
That person might know about the person who is next in line, but no further than that (beyond themselves, the person at the front doesn’t need to know or care). The second person in line might know about the third person in line, but no further. Continuing on this way, the last person in line knows that there is no one else that follows, so that must be the end.

[Figure: a line of people, from “First in line” at the front to the last person, labeled “Next == null, must be at the end”]

So… what are the advantages of storing data like this? When inserting or removing elements in an array, the entire array has to be reallocated. With a Linked List, only a small number of elements are affected. Only elements surrounding the changed element need to be updated, and all other elements can remain unaffected. This makes the Linked List much more efficient when it comes to adding or removing elements.

Now, imagine one person wants to step out of line. If this were an array, all of the data would have to be reconstructed elsewhere. In a Linked List, only three nodes are affected:
• the person leaving
• the person in front of that person
• the person behind that person

Imagine you are the person at the front of the line. You don’t really need to know or care what happens 10 people behind you, as that has no impact on you whatsoever.
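If it helps to see the line of people as code, here is a minimal sketch. The Person struct and both function names are invented for this illustration; your actual nodes will be templated and nested inside your list class.

```cpp
#include <cassert>
#include <string>

// A hypothetical singly-linked node: some data plus a pointer to whoever is next.
struct Person {
    std::string name;
    Person* next;  // nullptr means "nobody is behind me"
};

// Walk the line from the front and count how many people are in it.
int CountLine(const Person* front) {
    int count = 0;
    for (const Person* current = front; current != nullptr; current = current->next)
        ++count;
    return count;
}

// Follow "next" pointers until reaching the person with nobody behind them.
const Person* LastInLine(const Person* front) {
    assert(front != nullptr);  // an empty line has no last person
    while (front->next != nullptr)
        front = front->next;
    return front;
}
```

Each Person only ever looks at its own next pointer. Nobody stores the whole line, and the only way to reach the end is to start at the front and keep asking "who is behind you?".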
[Figure: the line of people, with the fifth person labeled “leaver”]

If the 5th person in line leaves, the only parts of the line that should be impacted are the 4th, 5th, and 6th spaces.
1. Person 4 has a new “next” Person: whoever was behind Person 5 (that is, Person 6).
2. Person 5 has to be removed from the list.
3. Person 6… actually does nothing. In this example, a Person only cares about whoever comes after them. Since Person 5 was before Person 6, Person 6 is unaffected. (A Linked List could be implemented with two-way information between nodes; more on that later.)

The same thought process can be applied if someone stepped into line (maybe a friend was holding their place).

[Figure: the same line, with the “New Guy” stepping in between Person 2 and Person 3]

In this case, Person 2 would change their “next” person from Person 3 to the new Person being added. New Guy would have his “next” pointer set to whoever Person 2 was previously keeping track of, Person 3. Because of the ordering process, Person 3 would remain unchanged, as would anyone else in the list (aside from being a bit irritated at the New Guy for cutting in line).

So that’s the concept behind a Linked List: a series of “nodes” which are connected via pointers to one another, where inserting/deleting nodes is a faster process than deleting and reconstructing the entire collection of data. Now, how to go about creating that?

2.4 Terminology
Node: The fundamental building block of a Linked List. A node contains the data actually being stored in the list, as well as 1 or more pointers to other nodes. This is typically implemented as a nested class (see below).
Singly-linked: A Linked List is singly-linked if each node has only a single pointer to another node, typically a “next” pointer. This only allows for uni-directional traversal of the data, from beginning to end.
Doubly-linked: Each node contains 2 pointers: a “next” pointer and a “previous” pointer. This allows for bi-directional traversal of the data, either front-to-back or back-to-front.
Head: A pointer to the first node in the list, akin to index 0 of an array.
Tail: A pointer to the last node in the list. May or may not be present, depending on the implementation of the list.
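With that vocabulary in hand, the “someone steps into line” and “someone steps out of line” scenarios from the previous section might look roughly like this for a singly-linked node. The names Node, InsertAfter, and RemoveAfter are placeholders chosen for this sketch; it ignores head/tail bookkeeping and assumes every node was created with new.

```cpp
#include <cassert>

// Hypothetical singly-linked node holding an int.
struct Node {
    int data;
    Node* next;
};

// "New Guy" steps into line right behind `before`.
// Only `before` and the new node change; everyone else is untouched.
Node* InsertAfter(Node* before, int value) {
    assert(before != nullptr);
    Node* newGuy = new Node{value, before->next};  // New Guy tracks whoever `before` tracked
    before->next = newGuy;                         // `before` now tracks New Guy
    return newGuy;
}

// The person right behind `before` steps out of line.
// Returns false if nobody was behind `before`.
bool RemoveAfter(Node* before) {
    assert(before != nullptr);
    Node* leaver = before->next;
    if (leaver == nullptr)
        return false;
    before->next = leaver->next;  // e.g. Person 4 now tracks Person 6
    delete leaver;                // Person 5 is removed from the list
    return true;
}
```

The order of the two pointer assignments in InsertAfter matters: if `before->next` were overwritten first, the rest of the line would be lost. A doubly-linked version would also have to update the “previous” pointers of the neighbors.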
2.5 Nested Classes
The purpose of writing a class is to group data and functionality. The purpose of a nested class is the same; the only difference is where we declare it. We declare a nested class like this:

    class MyClass {
    public:
        // Nested class
        struct NestedClass
        {
            int x, y, z;
            int SomeFunction();
        };
    private:
        // Data for "MyClass"
        NestedClass *somePtr;
        NestedClass myData[5];
        float values[10];
        // Etc…
    };

    // To create nested classes…
    // Use the Scope Resolution Operator
    MyClass::NestedClass SomeVariable;

    // With a class template…
    // (SomeType stands in for whatever type you instantiate the template with)
    TemplateClass<SomeType>::Nested bar;

    // NOTE 1: You can make nested classes private if you
    // wish, to limit access to them.

    // Why is NestedClass a struct in this case?
    // No reason in particular. Remember a struct is just a
    // class with public access by default.

    // NOTE 2: Nested classes and templates can make for
    // ugly code. Be sure to read the section on Templates!

Additional reading: cppreference

The nature of the Linked List is that each piece of information knows about the information which follows (or precedes) it. It would make sense, then, to create some nested class to group all of that information together.
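To see the scope resolution operator doing real work, here is a small non-template sketch. The class names Line and Person are invented for this example and are not part of the assignment. The point to notice is how a member function of the nested class is defined outside of both classes.

```cpp
#include <cassert>

class Line {
public:
    // Nested class: a Person only makes sense as part of a Line.
    struct Person {
        int ticketNumber;
        Person* next;
        bool IsLast() const;  // declared here, defined below
    };

    explicit Line(Person* front) : front_(front) {
        assert(front != nullptr);  // this sketch assumes a non-empty line
    }

    Person* Front() const { return front_; }

private:
    Person* front_;
};

// Outside the class, the full name is Outer::Nested::Function.
bool Line::Person::IsLast() const {
    return next == nullptr;
}
```

Code outside of Line has to write Line::Person to name the nested type, while code inside Line (like the Front() function) can simply say Person.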
A sample hierarchy of how a Linked List with a nested node class would work:

    LinkedListClass {
        NestedNodeClass {
            // The data you’re storing
            // A pointer to the next node
            // A pointer to the prev node (if doubly)
        };
        // A node pointer to head
        // A node pointer to tail (if doubly)
        // How many nodes are there?
    };

3 Benefits and Drawbacks
All data structures in programming (C++ or otherwise) have advantages and disadvantages. There is no “one size fits all” data structure. Some are faster (in some cases), some have smaller memory footprints, and some are more flexible in their functionality, which can make life easier for the programmer.

Array | Linked List
Fast access of individual elements as well as iteration over the entire array | Changing the Linked List is fast: nodes can be inserted/removed very quickly
Random access: you can quickly “jump” to the appropriate memory location of an element | Less affected by memory fragmentation; nodes can fit anywhere in memory
Changing the array is slow: you have to rebuild the entire array when adding/removing elements | No random access; slow iteration and access to individual elements
Memory fragmentation can be an issue for arrays: they need a single, contiguous block large enough for all of the data | Extra memory overhead for nodes/pointers
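The “random access” difference between arrays and Linked Lists is easy to see in code. In this sketch, ArrayAt, ListAt, and the hops output parameter are all hypothetical names used only to count the work being done; it assumes the caller passes a valid index.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int data;
    const Node* next;
};

// Array: one address calculation, no matter how large the index is.
int ArrayAt(const int* array, std::size_t index) {
    return array[index];
}

// Linked list: must follow `index` next-pointers, starting from the head.
// `hops` reports how many pointers were followed along the way.
int ListAt(const Node* head, std::size_t index, std::size_t& hops) {
    hops = 0;
    const Node* current = head;
    while (hops < index) {
        assert(current != nullptr && current->next != nullptr);  // index must be valid
        current = current->next;
        ++hops;
    }
    assert(current != nullptr);
    return current->data;
}
```

Reaching element 4 of the array is a single step, while reaching the node at index 4 takes four hops. That is the trade-off the comparison describes: cheap changes in exchange for slower lookups.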
4 Code Structure
As shown above, the Linked List class itself stores very little data: pointers to the first and last nodes, and a count. In some implementations, you might only have a pointer to the first node, and that’s it. In addition to those data members, your Linked List class must conform to the following specification! (Yes, it’s a lot… it’s a big project, so get started early!)

4.1 Construction / Destruction
Default Constructor: How many nodes are in an empty list? (Answer: 0) What is head pointing to? What is tail pointing to? (Answer: nullptr) Initialize your variables!
Copy Constructor: Sets “this” to a copy of the passed-in LinkedList. For example, if the other list has 10 nodes with values of 1-10, “this” should have a copy of that same data.
Destructor: The usual. Clean up your mess. (Delete all the nodes created by the list.)
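As a rough picture of how those three pieces fit together, here is a stripped-down sketch. The class name SketchList and its exact members are assumptions made for this example, not the required interface, and assignment is deliberately disabled because it is not shown here.

```cpp
#include <cassert>

// A minimal sketch; SketchList and its member names are hypothetical.
template <typename T>
class SketchList {
public:
    struct Node {
        T data;
        Node* next;
        Node* prev;
    };

    // Default constructor: an empty list has 0 nodes and null head/tail.
    SketchList() : head(nullptr), tail(nullptr), count(0) {}

    // Copy constructor: start empty, then copy each value from the other list.
    SketchList(const SketchList& other) : head(nullptr), tail(nullptr), count(0) {
        for (const Node* n = other.head; n != nullptr; n = n->next)
            AddTail(n->data);
    }

    // Assignment is left out of this sketch, so it is disabled for safety.
    SketchList& operator=(const SketchList&) = delete;

    // Destructor: delete every node this list created.
    ~SketchList() {
        Node* current = head;
        while (current != nullptr) {
            Node* next = current->next;  // save the link before deleting the node
            delete current;
            current = next;
        }
    }

    void AddTail(const T& value) {
        assert((head == nullptr) == (tail == nullptr));  // both null or both set
        Node* node = new Node{value, nullptr, tail};
        if (tail != nullptr)
            tail->next = node;
        else
            head = node;  // first node in a previously empty list
        tail = node;
        ++count;
    }

    unsigned int NodeCount() const { return count; }
    const Node* Head() const { return head; }
    const Node* Tail() const { return tail; }

private:
    Node* head;
    Node* tail;
    unsigned int count;
};
```

The copy constructor creates brand-new nodes rather than copying the other list’s pointers. If two lists shared the same nodes, both destructors would try to delete them.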
4.2 Behaviors
PrintForward: Iterate through all of the nodes and print out their values, one at a time.
PrintReverse: Exactly the same as PrintForward, except completely the opposite.
PrintForwardRecursive: This function takes in a pointer to a Node (a starting node). From that node, recursively visit each node that follows, in forward order, and print their values. This function MUST be implemented using recursion, or tests using it will be worth no points. Check your textbook for a reference on recursion.
PrintReverseRecursive: Same deal as PrintForwardRecursive, but in reverse.
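Here is one possible shape for the iterative and recursive printing, written as free functions so the sketch stays short. Two assumptions are baked in: the functions write to a std::ostream passed in by the caller (so the output can be captured), and “reverse” is taken to mean following the “previous” pointers from the starting node. Your member functions will look different.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical doubly-linked node holding an int.
struct Node {
    int data;
    Node* next;
    Node* prev;
};

// Iterative: walk the "next" pointers, printing one value per line.
void PrintForward(const Node* head, std::ostream& out) {
    for (const Node* n = head; n != nullptr; n = n->next)
        out << n->data << '\n';
}

// Recursive: print this node, then let the recursive call handle the rest.
void PrintForwardRecursive(const Node* start, std::ostream& out) {
    if (start == nullptr)
        return;  // base case: walked off the end of the list
    out << start->data << '\n';
    PrintForwardRecursive(start->next, out);
}

// Recursive, in reverse: same idea, but following the "prev" pointers.
void PrintReverseRecursive(const Node* start, std::ostream& out) {
    if (start == nullptr)
        return;
    assert(start->prev == nullptr || start->prev->next == start);  // links agree
    out << start->data << '\n';
    PrintReverseRecursive(start->prev, out);
}

// Convenience helper: capture the forward printout in a string.
std::string ForwardAsString(const Node* head) {
    std::ostringstream out;
    PrintForward(head, out);
    return out.str();
}
```

Every recursive function needs a base case that stops the recursion. Here it is the null pointer at either end of the list.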
4.3 Accessors
NodeCount: How many things are stored in this list?
FindAll: Find all nodes which match the passed-in parameter value, and store a pointer to each matching node in the passed-in vector. Use of a parameter like this (passing something in by reference, and storing data in it for later use) is called an output parameter.
Find: Find the first node with a data value matching the passed-in parameter, returning a pointer to that node. Returns nullptr if no matching node is found.
GetNode: Given an index, return a pointer to the node at that index. Throws an exception of type out_of_range if the index is out of range. Const and non-const versions.
Head: Returns the head pointer. Const and non-const versions.
Tail: Returns the tail pointer. Const and non-const versions.
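The searching accessors all boil down to the same loop over the nodes. The following sketch uses free functions on a hypothetical int-only Node to keep things compact; the parameter name outData and the AccessorDemo function are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical singly-linked node holding an int.
struct Node {
    int data;
    Node* next;
};

// Returns the first node whose data matches, or nullptr if there is none.
Node* Find(Node* head, int value) {
    for (Node* n = head; n != nullptr; n = n->next)
        if (n->data == value)
            return n;
    return nullptr;
}

// Output parameter: every matching node is appended to outData.
void FindAll(Node* head, int value, std::vector<Node*>& outData) {
    for (Node* n = head; n != nullptr; n = n->next)
        if (n->data == value)
            outData.push_back(n);
}

// Returns the node at the given index, or throws if the list is too short.
Node* GetNode(Node* head, std::size_t index) {
    Node* n = head;
    for (std::size_t i = 0; i < index && n != nullptr; ++i)
        n = n->next;
    if (n == nullptr)
        throw std::out_of_range("GetNode: index out of range");
    return n;
}

// Usage sketch: a three-node list holding 2, 5, 2.
void AccessorDemo() {
    Node c{2, nullptr};
    Node b{5, &c};
    Node a{2, &b};
    std::vector<Node*> matches;
    FindAll(&a, 2, matches);
    assert(matches.size() == 2);
    assert(Find(&a, 5) == &b);
    assert(GetNode(&a, 2) == &c);
}
```

Inside a class, the “const and non-const versions” would be two member functions with the same name, one marked const and returning a pointer to a const Node.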
4.4 Insertions
AddHead: Create a new Node at the front of the list to store the passed-in parameter.
AddTail: Create a new Node at the end of the list to store the passed-in parameter.
AddNodesHead: Given an array of values, insert a node for each of those at the beginning of the list, maintaining the original order.
AddNodesTail: Ditto, except adding to the end of the list.
InsertAfter: Given a pointer to a node, create a new node to store the passed-in value, after the indicated node.
InsertBefore: Ditto, except insert the new node before the indicated node.
InsertAt: Inserts a new Node to store the first parameter, at the index-th location. So if you specified 3 as the index, the new Node should have 3 Nodes before it. Throws an out_of_range exception if given an invalid index.

4.5 Removals
RemoveHead: Deletes the first Node in the list. Returns whether or not the Node was removed.
RemoveTail: Deletes the last Node, returning whether or not the operation was successful.
Remove: Remove ALL Nodes containing values matching that of the passed-in parameter. Returns how many instances were removed.
RemoveAt: Deletes the index-th Node from the list, returning whether or not the operation was successful.
Clear: Deletes all Nodes. Don’t forget the node count: how many nodes are left after you deleted all of them?
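A handful of these operations are sketched below for a bare-bones doubly-linked list of ints. Everything here (IntList, the free-function style, and the Unlink helper) is a simplification invented for this example. The things to watch are how head, tail, and the count are kept in sync on every change.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal doubly-linked list of ints.
struct Node {
    int data;
    Node* next;
    Node* prev;
};

struct IntList {
    Node* head = nullptr;
    Node* tail = nullptr;
    unsigned int count = 0;
};

void AddHead(IntList& list, int value) {
    Node* node = new Node{value, list.head, nullptr};
    if (list.head != nullptr)
        list.head->prev = node;
    else
        list.tail = node;  // list was empty, so the new node is also the tail
    list.head = node;
    ++list.count;
}

// Adding the values back-to-front means values[0] ends up first,
// which keeps the original order.
void AddNodesHead(IntList& list, const int* values, std::size_t n) {
    for (std::size_t i = n; i > 0; --i)
        AddHead(list, values[i - 1]);
}

// Unlink and delete one node, patching its neighbors (or head/tail).
void Unlink(IntList& list, Node* node) {
    assert(node != nullptr && list.count > 0);
    if (node->prev != nullptr) node->prev->next = node->next;
    else list.head = node->next;
    if (node->next != nullptr) node->next->prev = node->prev;
    else list.tail = node->prev;
    delete node;
    --list.count;
}

bool RemoveHead(IntList& list) {
    if (list.head == nullptr)
        return false;  // nothing to remove
    Unlink(list, list.head);
    return true;
}

// Removes ALL nodes matching value and reports how many were removed.
unsigned int Remove(IntList& list, int value) {
    unsigned int removed = 0;
    Node* current = list.head;
    while (current != nullptr) {
        Node* next = current->next;  // grab the link before possibly deleting
        if (current->data == value) {
            Unlink(list, current);
            ++removed;
        }
        current = next;
    }
    return removed;
}

void Clear(IntList& list) {
    while (RemoveHead(list)) {}
}
```

Funneling every deletion through one helper keeps the pointer patching and the node count in a single place, which makes the “don’t forget the node count” mistake much harder to make.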
4.6 Operators
operator[]: Overloaded subscript operator. Takes an index, and returns data from the index-th node. Throws an out_of_range exception for an invalid index. Const and non-const versions.
operator=: Assignment operator. After listA = listB, listA == listB is true. Can you use any of your existing functions to make writing this one easier? (Hint: Yes you can.)
operator==: Overloaded equality operator. Given listA and listB, is listA equal to listB? What would make one Linked List equal to another? If each of its nodes were equal to the corresponding node of the other (which also means both lists have the same number of nodes). (Similar to comparing two arrays, just with non-contiguous data.)
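To show how the three operators can lean on functions you already have, here is a compact sketch. MiniList, CopyFrom, and the singly-linked layout are assumptions made to keep the example short; they are not the required design.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical minimal list with only what the three operators need.
template <typename T>
class MiniList {
    struct Node {
        T data;
        Node* next;
    };
    Node* head = nullptr;
    Node* tail = nullptr;
    std::size_t count = 0;

public:
    MiniList() = default;
    MiniList(const MiniList& other) { CopyFrom(other); }
    ~MiniList() { Clear(); }

    void AddTail(const T& value) {
        Node* node = new Node{value, nullptr};
        if (tail != nullptr) tail->next = node;
        else head = node;
        tail = node;
        ++count;
    }

    void Clear() {
        while (head != nullptr) {
            Node* next = head->next;
            delete head;
            head = next;
        }
        tail = nullptr;
        count = 0;
    }

    // Assignment reuses Clear() and the same copy loop as the copy constructor.
    MiniList& operator=(const MiniList& rhs) {
        if (this != &rhs) {  // guard against listA = listA
            Clear();
            CopyFrom(rhs);
        }
        return *this;
    }

    // Const subscript: check the index, then walk that many hops from head.
    const T& operator[](std::size_t index) const {
        if (index >= count)
            throw std::out_of_range("MiniList: index out of range");
        const Node* n = head;
        for (std::size_t i = 0; i < index; ++i)
            n = n->next;
        return n->data;
    }

    // Non-const subscript: reuse the const version instead of repeating it.
    T& operator[](std::size_t index) {
        const MiniList& self = *this;
        return const_cast<T&>(self[index]);
    }

    // Equal when the counts match and every pair of corresponding nodes matches.
    bool operator==(const MiniList& rhs) const {
        if (count != rhs.count)
            return false;
        const Node* a = head;
        const Node* b = rhs.head;
        for (; a != nullptr; a = a->next, b = b->next)
            if (!(a->data == b->data))
                return false;
        return true;
    }

private:
    void CopyFrom(const MiniList& other) {
        assert(head == nullptr && count == 0);  // only copy into an empty list
        for (const Node* n = other.head; n != nullptr; n = n->next)
            AddTail(n->data);
    }
};
```

The self-assignment check in operator= is easy to forget. Without it, listA = listA would clear the list and then try to copy from the now-empty list.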
5 Template types
The compilation process for templates requires specialization; that is, your compiler essentially copies and pastes a version of your template, replacing all the instances of T with the type you specified when creating an instance of the class. When dealing with templates within templates, nested template classes, etc… the compiler sometimes needs a bit of assistance when figuring out a type. In order to know how to specialize, it has to know everything about the template class. If that template has some other template, it needs to know everything… before it knows everything… and there lies the problem.
The typename keyword is a way of telling your compiler “Hey, what immediately follows typename is a data type. Treat it as such when you compile everything else.” For example:

    template <typename T>
    class Foo {
    public:
        struct Nested {
            T someThing;
        };
        Nested SomeFunction();
    };

When defining the function “SomeFunction” you might write this:

    template <typename T>
    Nested Foo<T>::SomeFunction() // Error, what is a "Nested"?

You could clean that up by specifying that a Nested object is part of the Foo class:

    template <typename T>
    Foo::Nested Foo<T>::SomeFunction() // Error, Foo is a template class, label it as such

Okay… how about this?

    template <typename T>
    Foo<T>::Nested Foo<T>::SomeFunction() // This SHOULD work, but… still doesn't.

Why? Foo<T>::Nested depends on T, and the compiler doesn’t know what T is until the template is actually used. Until then it can’t be sure what Foo<T> contains, so it can’t tell whether Nested is a type or something else (a variable, a function…), and by default it assumes Nested is NOT a type. But it has to make sense of this line now, before it knows everything about Foo<T>. How can it rely on something that isn’t fully known yet? Answer: It can’t. Solution: using the typename keyword, tell the compiler that Nested IS IN FACT A TYPE, so the compiler doesn’t have to wait until it knows T before finding out what it is.

    // typename: Yes, compiler, Foo<T>::Nested is a type. Honest. You'll realize it later.
    template <typename T>
    typename Foo<T>::Nested Foo<T>::SomeFunction()

Ugly? You bet! Part of the joy of working with templates? It sure is!
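For reference, here is a complete version of the Foo example that should compile as written. The constructor, the stored_ member, AnotherFunction, and TypenameDemo are additions made for this sketch so that there is something to return and call; they are not part of the original example.

```cpp
#include <cassert>

template <typename T>
class Foo {
public:
    struct Nested {
        T someThing;
    };

    explicit Foo(const T& value) : stored_{value} {}

    Nested SomeFunction();
    Nested AnotherFunction();  // hypothetical, for comparison

private:
    Nested stored_;
};

// The return type names a nested type of a template, so it needs typename.
template <typename T>
typename Foo<T>::Nested Foo<T>::SomeFunction() {
    return stored_;
}

// Alternative: a trailing return type comes after "Foo<T>::", so by then the
// compiler is already looking inside Foo<T> and plain "Nested" is enough.
template <typename T>
auto Foo<T>::AnotherFunction() -> Nested {
    return stored_;
}

// Usage sketch: outside of a template, Foo<int>::Nested needs no typename,
// because nothing about it depends on an unknown T.
void TypenameDemo() {
    Foo<int> foo(42);
    Foo<int>::Nested copy = foo.SomeFunction();
    assert(copy.someThing == 42);
    assert(foo.AnotherFunction().someThing == 42);
}
```

Both definitions do the same thing, so pick whichever style you find easier to read. Newer compilers in C++20 mode may also accept the return type without typename in this position, but writing typename remains valid and works with older standards too.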
Every programming language has quirks like this that you just have to learn over time.

6 Tips
A few tips for this assignment:
• Start small! Work on one bit of functionality at a time. Work on things like AddHead()/AddTail() and PrintForward() first, as well as accessors (brackets operator, Head()/Tail(), etc). You can’t really test anything else unless those are working.
• Your output is simple: print the data, print newline
• Remember the “Big Three” or the “Rule of Three”: if you define one of the three special functions (copy constructor, assignment operator, or destructor), you should define the other two
• Refer back to the recommended chapters in your textbook as well as lecture videos for an explanation of the details of dynamic memory allocation. There are a lot of things to remember when dealing with memory allocation
• Make charts, diagrams, sketches of the problem. Memory is inherently difficult to visualize; find a way that works for you.
• Don’t forget your node count!