Last time we saw how to write a Vector structure that behaves a lot like an array but whose size can change. Vectors work fairly well as long as they are relatively short or we usually add elements at the end. If we add an element near the beginning of a vector, we need to move all the elements that follow it to make room. If the vector is big, this takes time. If we mostly add elements at the end, this is not a problem.
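To make the cost of an early insertion concrete, here is a minimal sketch. IntVector, CAPACITY, and insertAt are hypothetical stand-ins introduced for illustration, not the Vector from last time; the sketch assumes int elements and a fixed capacity to stay short.

```c
#include <assert.h>
#include <string.h>

#define CAPACITY 8   /* Assumed fixed capacity, for illustration only. */

/* Hypothetical array-backed vector of ints. */
typedef struct {
    int data[CAPACITY];
    int size;
} IntVector;

/* Insert value at index, shifting every later element one slot to the
   right. Returns the number of elements that had to move. */
int insertAt (IntVector *v, int index, int value) {
    int moved;

    assert (v != NULL && v->size < CAPACITY);
    assert (index >= 0 && index <= v->size);

    /* Everything from index to the end must move to make room. */
    moved = v->size - index;
    memmove (&v->data[index + 1], &v->data[index],
             (size_t) moved * sizeof (int));

    v->data[index] = value;
    v->size++;
    return moved;
}
```

Inserting at index 0 moves every element, while inserting at index size moves none, which is why adding at the end is cheap.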
If we want to find an element, we need to walk through the vector one element at a time. If the element is not in the vector we'll need to look at each element in the vector to determine this. Again, if the vector gets big, this will take time.
One solution might be to keep the elements in the vector sorted, but to do that will require more frequent insertions into the middle of a vector. Catch-22.
So, vectors are not the best structure for large collections where we care about the order of the elements or where we do a lot of searching for specific elements.
A linked list is another structure we can use to hold collections of values and whose size can grow and shrink dynamically. A linked list uses pointers heavily. Instead of having an array of values, we chain together the values with pointers. To put a value into a linked list, we create a "node". Inside the node is the value and a pointer to the next node in the list:
    typedef struct node {
        int value;
        struct node *next;
    } Node;

Two points about this declaration. First, notice that we said "typedef struct node" instead of just "typedef struct". That was so that we could declare the next pointer inside the struct. We couldn't declare next to be "Node *" because Node is not yet defined when we reach the declaration of next. In C, a name must be declared earlier in the file than the point where it is used, and this little trick works nicely for structs.
The second point is to notice that we declared our value to be an int. We could have made it a void * and created a linked list that held similar values to our vector from yesterday. However, normally when we use linked lists, it is because we want to maintain an ordering based on the values. Often, the values in the linked list are sorted. Therefore, we need a function to compare the values in two nodes. If the types of the values are void *, we wouldn't know what we were comparing so we wouldn't know what function to call to compare the values. Therefore, we need to reimplement linked lists for each type of value we want to place in them. (C++ offers templates and function overloading to get around this problem.)
A list is just a chain of these nodes linked together. To manipulate a list, we just need to remember the first node in the chain:
    typedef struct {
        Node *head;
    } LinkedList;
To create a linked list, we just need to allocate memory for the list struct and initialize its fields:
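    /* malloc is declared in <stdlib.h>. */
    LinkedList *newLinkedList () {
        LinkedList *new = malloc (sizeof (LinkedList));

        /* malloc returns NULL if no memory is available. */
        if (new != NULL) {
            new->head = NULL;
        }
        return new;
    }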
To insert a new node in the list, we first must find the right location for the node so that the values are in sorted order. After we have done that we must update 2 pointers: the next pointer of the new node as well as the pointer pointing at the new node.
    void insert (LinkedList *l, int value) {
        /* The node that should precede the new node in the list. */
        Node *previous = NULL;

        /* The node we are currently comparing to our new value. */
        Node *current = l->head;

        /* The new node. newNode is shown later in these notes. */
        Node *new = newNode (value);

        /* Leave the list unchanged if the node could not be allocated. */
        if (new == NULL) {
            return;
        }

        /* Find the nodes that should precede and follow the new node. */
        while (current != NULL && current->value < value) {
            previous = current;
            current = current->next;
        }

        /* Special case: the new node becomes the first node in the list. */
        if (previous == NULL) {
            l->head = new;
        } else {
            previous->next = new;
        }

        /* Set the node that follows the new node. */
        new->next = current;
    }
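The two pointer updates can also be written without the special case for the head. This is a sketch of an alternative technique, not the version used in these notes: instead of remembering the previous node, it remembers the address of the pointer that will end up pointing at the new node. The name insertViaLink and the int return value are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int value;
    struct node *next;
} Node;

typedef struct {
    Node *head;
} LinkedList;

/* Hypothetical variant of insert. Returns 0 on success and -1 if the
   node could not be allocated. */
int insertViaLink (LinkedList *l, int value) {
    Node **link;
    Node *new;

    assert (l != NULL);

    new = malloc (sizeof (Node));
    if (new == NULL) {
        return -1;
    }
    new->value = value;

    /* link holds the address of the pointer we may need to change:
       first the head, then each node's next field in turn. */
    link = &l->head;
    while (*link != NULL && (*link)->value < value) {
        link = &(*link)->next;
    }

    /* The two updates: the new node's next pointer, and the pointer
       that points at the new node. */
    new->next = *link;
    *link = new;
    return 0;
}
```

Because the head and every next field are all of type Node *, one loop handles an empty list, an insertion at the front, and an insertion in the middle or at the end.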
To remove a node from the list, we only need to update one pointer, the pointer that points at the node we are removing. We should also remember to free the memory associated with the node that we remove.
    /* printf is declared in <stdio.h>, free in <stdlib.h>. */
    void removeValue (LinkedList *l, int value) {
        /* The node that precedes the node we are removing. */
        Node *previous = NULL;

        /* The node we are examining. */
        Node *current = l->head;

        /* Keep looking until there are no more nodes or we have found
           or gone past the value we are looking for. */
        while (current != NULL && current->value < value) {
            previous = current;
            current = current->next;
        }

        /* If we reached the end of the list or we found a value higher
           than what we wanted, then the value is not in the list. */
        if (current == NULL || current->value > value) {
            printf ("%d not found.\n", value);
            return;
        }

        if (previous == NULL) {
            /* Removing the head of the list. */
            l->head = current->next;
        } else {
            /* Remove the current element by having its predecessor skip it. */
            previous->next = current->next;
        }

        /* Free the memory associated with the node we just removed. */
        free (current);
    }
To free a list, we must first free all the nodes in the list and then the list itself.
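    void freeLinkedList (LinkedList *l) {
        Node *current = l->head;
        Node *next;

        while (current != NULL) {
            /* Save the next pointer first; a node must not be read
               after it has been freed. */
            next = current->next;
            free (current);
            current = next;
        }
        free (l);
    }

The newNode function, used by insert above, allocates and initializes a single node:

    Node *newNode (int value) {
        Node *new = malloc (sizeof (Node));

        /* malloc returns NULL if no memory is available. */
        if (new != NULL) {
            new->value = value;
            new->next = NULL;
        }
        return new;
    }

The full code for the linked list example can be found in /home/facutly/freund/shared/cs010/list.c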
Adding elements early in a linked list is cheaper than in a Vector since we do not need to move anything to make room for the new element.
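If we search for a value in a sorted list, we can stop as soon as we reach a node with a larger value. On average we examine about half the list, whether we find the value or determine that it is not in the list. This is better than an unsorted vector, where we must examine every element to determine that a value is missing.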
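A minimal sketch of such a search follows. The function name contains and its examined out-parameter are introduced here for illustration; the out-parameter exists only so that we can see how many nodes were visited.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    int value;
    struct node *next;
} Node;

typedef struct {
    Node *head;
} LinkedList;

/* Returns 1 if value is in the sorted list l and 0 otherwise. If
   examined is not NULL, it receives the number of nodes visited. */
int contains (const LinkedList *l, int value, int *examined) {
    const Node *current;
    int count = 0;

    assert (l != NULL);

    for (current = l->head; current != NULL; current = current->next) {
        count++;
        /* The list is sorted, so once we reach a value that is not
           smaller than the target there is no point in going on. */
        if (current->value >= value) {
            break;
        }
    }

    if (examined != NULL) {
        *examined = count;
    }
    return current != NULL && current->value == value;
}
```

For a list holding 2, 5, and 9, a search for 3 stops at the node holding 5 after visiting two nodes, while a search for 10 has to visit all three.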
There is no direct access to elements given an index as there is with vectors. To find an element by position, we need to start at the beginning of the list and count nodes as we walk the list.
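Counting nodes to reach a position can be sketched as below. nodeAt is a hypothetical helper named here for illustration, and returning NULL for a position that does not exist is an assumed convention.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    int value;
    struct node *next;
} Node;

typedef struct {
    Node *head;
} LinkedList;

/* Returns the node at the given position, counting from 0, or NULL if
   the list has no node at that position. */
const Node *nodeAt (const LinkedList *l, int index) {
    const Node *current;

    assert (l != NULL);

    if (index < 0) {
        return NULL;
    }

    /* Walk from the head, following one next pointer per position. */
    current = l->head;
    while (current != NULL && index > 0) {
        current = current->next;
        index--;
    }
    return current;
}
```

Reaching position i follows i pointers, so the cost grows with the position, whereas indexing into a vector takes the same time for every position.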
Adding nodes towards the end of a linked list is expensive because we need to walk the list to get to the spot where the addition will occur.
Linked lists often require more memory than Vectors. Each value in a linked list requires additional memory to hold a pointer. In a Vector, there may be wasted space due to unused capacity. The closer our vector is to full, the less wasted space there is.