Collections
A collection is something that holds zero or more elements in some fashion that allows you enumerate those elements, add or remove elements, find them and so forth.
Vector - a dynamic array. Appending or removing elements from the end is cheap (providing the array is large enough to accomodate an additional item). Inserting items or removing them from any other part of the array is more expensive and involves memory movements. Generally speaking you should always reserve enough space in a vector for the most elements you anticipate it will hold. Reallocating memory can be expensive and lead to fragmentation.
Vecdeque - a ring buffer array. Items can be added or removed from either end relatively cheaply. Items in the array are not arranged sequentially so there is a little more complexity to managing wraparound and removal than a Vector.
LinkedList - a linked list individually allocates memory for each element making it cheap to add or remove elements from anywhere in the list. However there is a lot more overhead to iterating the list by index and much more heap allocation.
Set - a collection that holds a unique set of items. Inserting a duplicate item will not succeed. Some sets maintain the order of insertion. Sets are useful where you want to weed out duplicates from an input.
Map - a collection where each item is referenced by a unique key. Some maps can maintain the order of insertion.
C++ and Rust have have collections as part of their standard library as is common with modern languages.
C | C++ | Rust |
---|---|---|
- | std::vector |
std::vec::Vec or std::collections::VecDeque |
- | std::list |
std::collections::LinkedList |
- | std::set |
std::collections::HashSet , std::collections::BTreeSet |
- | std::map |
std::collections::HashMap , std::collections::BTreeMap |
C has no standard collection classes or types. Some libraries offer collection APIs such as glib or cii.
Iterators
Iterators are a reference to a position in a collection with the means to step through the collection one element at a time.
C++
C++11 provides a shorthand way of iterating a collection:
std::vector<char> values;
for (const char &c : values) {
// do something to process the value in c
}
Iterators are more explicit in C++98 and before and the code in C++11 is basically equivalent to this:
std::vector<char> values;
for (std::vector<char>::const_iterator i = values.begin(); i != values.end(); ++i) {
const char &c = *i;
// do something to process the value in c
}
This is quite verbose, but essentially each collection type defines a mutable iterator
and immutable const_iterator
type and calling begin
returns an iterator to the beginning of the collection. Calling the ++
operator overload on the iterator causes it to advance to the next element in the collection. When it hits the exclusive value returned by end
it has reached the end of the collection.
Obviously with an indexed type such as a vector
you could also reference elements by index, but it is far more efficient to use iterators for other collection types.
Processing collections
C++ provides a number of utility templates in
Rust
Rust also has iterators which are superficially similar fashion to C++ - incrementing their way through collections.
However the iterator concept is taken a LOT further in Rust. Iterators can be chained together in Rust to produce some powful and terse operations.
A conventional loop might look like this:
let values = vec![10, 20, 30, 40, 50, 60, 70, 80, 90, 100];
for v in &values {
println!("Value = {}", v);
}
In this instance, the iterator is invisible. The value v
is a reference to the i32
of the currently iterated element so each iteration prints a different value.
However we can also be explicit and obtain the iterator if we desire and apply an action to each value:
let values = vec![10, 20, 30, 40, 50, 60, 70, 80, 90, 100];
values.iter().for_each(|v| println!("Value = {}", v);
In this example the code calls for_each
on the Iterator
which iterates over each element and calls the closure.
We can go further. Let's say we want to only print the first 5 results:
let values = vec![10, 20, 30, 40, 50, 60, 70, 80, 90, 100];
values.iter().take(5).for_each(|v| println!("Value = {}", v));
Now the code calls take
on the Iterator
which produces a Take<Iterator>
which iterates only 5 times before it ends.
Perhaps we want to produce a tuple, consisting of the index of the iterator and f64
result of dividing the number by 5.
let values = vec![10, 20, 30, 40, 50, 60, 70, 80, 90, 100];
let result = values.
iter().
enumerate().
map(|v| (v.0, *v.1 as f64 / 5.0) ).
collect::<Vec<(usize, f64)>>();
Breaking this down, we iterate, enumerate()
produces a tuple (usize, i32)
from the index and value, map()
creates a new tuple (usize, f64)
and then finally the result is gathered into a new collection.
As you can see iterators expose very powerful functions that are an efficient, terse and provide less chance for error than writing a loop.