Data structure problems
There are a large number of possible data structures that are generally
useful. These include lists (providing sequential access to a collection
of values), vectors (providing random access to a collection of values),
maps (associative arrays, providing efficent access to an item specified
by a key), sets (unordered collections of unique values), and so on.
The problems here are intended as a way of focussing attention on the
range of applications that the data structures may be used in, and as
a gauge for checking that any proposals are capable of dealing with a
range of uses. Suggestions for additional problems are welcome!
- Generate a list of the ten commonest words in a file. This requires a
mapping of string keys to integer values, and the ability to extract
the contents into a list of (string,integer) pairs which can be
sorted based on the integer component of the pair.
- Given a tree representing a hierarchy of windows in a GUI, build a
list of "pointers" to selected tree nodes (e.g. all buttons in the
hierarchy) so that it can be traversed to selectively update that
particular set of windows (e.g. by changing the font used for the
text of each button).
This requires the ability to build arbitrary tree structures and to
keep "pointers" to individual nodes outside the tree itself.
- Implement an efficient sorting algorithm (e.g. quicksort) for a
collection of values. This requires that the collection should
allow for adequately efficient sorting without the need for a
special "sort" primitive. The algorithm should be applicable to
a variety of data structures (all proposed data structures?) without
any changes. This requires some common protocol for accessing items
in different data structures.
- Generate the set of all possible permutations of a list of values.
This requires the ability to recursively permute sublists of the
original list.
- Build a directed graph to represent an non-deterministic finite
state automaton (NFA). This requires that the number of edges leading
from a particular node can be made arbitrarily large, and the number
of edges leading to a particular node can also be made arbitrarily
large. Note that matching using an NFA requires sets of states to be
manipulated.
- Implement a stack using another more general data structure (e.g. a
list), and use this together with a directed graph representing a maze
to find a route through the maze.
Some related design issues:
- Should controlled types be used? If so, automatic storage reclamation
as part of finalization is possible, but generic packages using controlled
types can only be instatiated at library level.
- Should it be possible to design algorithms which can be applied equally
well to any data structure (although there may be a performance difference
between particular data structures)?
- Should structures such as stacks, queues, deques and so on be provided,
or should a library focus on types such as lists which can be used to
build such specialised types?
- Should common algorithms like those in the C++ STL be provided (partition,
merge, and so on)?
- What additional data types (other than data collections) would be useful,
such as reference-counted objects?