Intro to Data Structures
C++ has some built-in methods of storing compound data in useful
ways, like arrays and structs. There are other methods of storing
data in unique ways that have useful applications in computer science.
The general concept of what a data container stores, along with
general operations we want on that data, is an example of an Abstract
Data Type. Note that a typical ADT does not mandate HOW the storage
is set up inside, but rather what the general concept is from the user's
point of view. Examples:
- A sequence (or list) is the idea of a container that
stores a sequence of data, in a specific order, possibly with duplicates.
For example, a list of test grades would be written down in some order,
with a first item, a second item, etc. and there could be multiple
students with the same test grade
- A set is a collection of data that does not have any
duplicates, and order may or may not matter. Picture a Venn Diagram --
are you inside the circle or outside? In the set of all students wearing
hats, you may be in the set if you are wearing a hat, but you are only in
it once
Sometimes these ADT concepts have mulitple possible implementations. So
the study of data structures involves looking at the various possible
implementations, along with the advantages and disadvantages of each. C++
classes allow a nice mechanism for implementing such data types so that
the interface (the public section) provides for the necessary operations,
and the actual implementation details are internal and
hidden. The implementation of the class can be made to handle
all of the storage details, along with any necessary dynamic allocation
and cleanup. Common data structures include: stacks, queues,
vectors, linked lists, trees, hash tables, sets, and others. We
will introduce a few of these concepts here.
Stacks
This is an ADT (abstract data type).
-
First In Last Out (FILO). Insertions and removals from "top" position
only
-
Analgoy - a stack of cafeteria trays. New trays placed on top.
Trays picked up from the top.
-
A stack class will have two primary operations:
-
push -- adds an item onto the top of the stack
-
pop -- removes the top item from the stack
-
Typical application areas include compilers, operating systems, handling
of program memory (nested function calls)
Queues
This is an ADT (abstract data type).
-
First In First Out (FIFO). Insertions at the "end" of the queue,
and removals from the "front" of the queue.
-
Analogy - waiting in line for a ride at an amusement park. Get in
line at the end. First come, first serve.
-
A queue class will have two primary operations:
-
enqueue -- adds an item into the queue (i.e. at the back of the
line)
-
dequeue -- removes an item from the queue (i.e. from the front of
the line).
-
Typical application areas include print job scheduling, operating systems
(process scheduling).
Vector
This is a specific implementation of the sequence/list ADT.
-
A data structure that stores items of the same type, and is based on storage
in an array
-
By encapsulating an array into a class (a vector class), we can
-
use dynamic allocation to allow the internal array to be flexible in size
-
handle boundary issues of the array (error checking for out-of-bounds indices).
-
Advantages: Random access - i.e. quick locating of data if the index
is known.
-
Disadvantages: Inserts and Deletes are typically slow, since they
may require shifting many elements to consecutive array slots
Linked List
This is a specific implementation of the sequence/list ADT.
-
A collection of data items linked together with pointers, lined up "in
a row". Typically a list of data of the same type, like an array,
but storage is arranged differently.
-
Made up of a collection of "nodes", which are created from a self-referential
class (or struct).
-
Self-referential class: a class whose member data contains at least
one pointer that points to an object of the same class type.
-
Each node contains a piece of data, and a pointer to the next node.
-
Nodes can be anywhere in memory (not restricted to consecutive slots, like
in an array).
-
Nodes generally allocated dynamically, so a linked list can grow to any
size, theoretically (within the boundaries of the program's memory).
-
An alternative to array-based storage.
-
Advantages: Inserts and Deletes are typically fast. Require
only creation of a new node, and changing of a few pointers.
-
Disadvantage: No random access. Possible to build indexing
into a linked list class, but locating an element requires walking through
the list.
-
Notice that the advantages of the array (vector) are generally the disadvantages
of the linked list, and vice versa
Note: Some abstract types, like Stacks and Queues, can be
implemented with a vector or with a linked list. A stack can
use a linked list as its underlying storage mechanism, for instance, but
would limit the access to the list to just the "push" and "pop" concepts
(insert and remove from one end).
Code Examples:
Some code examples from Deitel (previously used textbook)
These examples are all implemented as class templates.
- Fig 21_3-5: An example of a Linked List class.
- Fig 21_13-14: An example of a Stack class, implemented
with inheritance using the linked list class. Stack is derived from
List.
- Fig 21_15: An example of a Stack class, implemented
with composition, using the linked list class. Stack contains a List
object as member data.
- Fig 21_16-17: An example of a Queue class, derived from
List.
- Fig 21_20-22: An example of a Tree class. This
implementation is a binary tree (each node has two pointers to other
Nodes).