Commit aeef27b4 authored by Joshua Liebow-Feeser

[recovery-netstack] IMPROVEMENTS.md: Add improvements

- Add dedicated "Performance Issues" section
- Add "Single-threaded execution" sub-section
- Add "`debug_err!` macro" sub-section
- Add "Pervasive use of `HashMap`s" sub-section
- Expand "Devices in `HashMap`s" sub-section

Change-Id: I6721b556cf21edf03f24fc21b3282b4e5fb64461
parent e658fff2
@@ -4,7 +4,125 @@ This document describes various improvements that we might want to make to
structure or algorithms in the future, but that we don't have concrete plans to
implement. It's essentially a bucket of ideas we don't want to forget.

## Performance Issues

This section documents design and implementation details that we speculate may
cause performance issues.

### Single-threaded execution

*Contact: joshlf@google.com*
Our single-threaded execution model is great for simplicity, but will likely run
into problems in practice depending on our performance goals. These include:
- Large state changes (add/remove device, update forwarding table, etc.) take
much longer to compute than a normal event like receipt of a packet, and stall
the entire event loop until they complete. This may introduce unacceptably
large latency spikes.
- If packet flow is high enough (e.g., in the case of a router forwarding
traffic for a network), the single execution thread may not be able to keep up
with throughput.

#### Possible solutions

- Use [read-copy-update](https://en.wikipedia.org/wiki/Read-copy-update) (RCU)
or another read-optimized architecture; a minimal sketch appears at the end of
this sub-section. It is likely that any performance goals we have, regardless
of application domain, will heavily favor read throughput over write
throughput. "Reading" here refers to reading shared data structures like
device state, connection state, forwarding tables, etc. In a read-optimized
architecture, most threads could spend most of their time handling common
events like packet receipts or application client messages, and stay on the
optimized, read-only hot path most of the time. Write operations such as
device insertion/removal and forwarding table updates would likely perform
much worse than in the current model, but that's probably an acceptable
tradeoff.
Note that RCU and similar schemes can incur significant complexity. However, I
believe that Rust should make it much easier to encapsulate this complexity
such that most of the stack can remain agnostic to these details.
Another thing to keep in mind with this approach is that the read/write
distinction is not a perfect one. For example, handling a segment in a TCP
connection involves read-only operations on global data structures such as the
forwarding table, but involves write operations on the TCP connection state.
It's likely that, even in a read-optimized world, coarser-grained
synchronization around less global state like TCP connection state would be
acceptable.
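
Below is a minimal sketch of the snapshot idea under this approach. It is not
a true RCU: readers here still take a brief lock to clone an `Arc`, where a
real implementation would avoid even that (e.g., with atomics). All types are
hypothetical stand-ins.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for any read-mostly shared state (forwarding table, device map,
/// etc.).
#[derive(Clone, Default)]
struct ForwardingTable {
    routes: Vec<(u32, u32)>, // simplified (subnet, device) pairs
}

/// RCU-like cell: readers take a cheap snapshot; writers publish a new copy.
struct RcuCell<T> {
    current: Mutex<Arc<T>>,
}

impl<T: Clone> RcuCell<T> {
    fn new(value: T) -> Self {
        RcuCell { current: Mutex::new(Arc::new(value)) }
    }

    /// Read path: briefly lock to clone the `Arc`. The snapshot can then be
    /// used with no further synchronization, even while writers publish new
    /// versions.
    fn read(&self) -> Arc<T> {
        Arc::clone(&self.current.lock().unwrap())
    }

    /// Write path: copy, modify, and atomically publish a new version. Much
    /// slower than the read path, which is the intended tradeoff.
    fn update(&self, f: impl FnOnce(&mut T)) {
        let mut guard = self.current.lock().unwrap();
        let mut next = (**guard).clone();
        f(&mut next);
        *guard = Arc::new(next);
    }
}

fn main() {
    let table = RcuCell::new(ForwardingTable::default());
    table.update(|t| t.routes.push((0x0a00_0000, 1))); // slow path: add route
    let snapshot = table.read(); // hot path: grab a stable snapshot
    assert_eq!(snapshot.routes.len(), 1);
}
```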

### `debug_err!` macro

*Contact: joshlf@google.com*

The `debug_err!` macro is used to emit a debug log and then return an error
value. The `debug_err_fn!` macro produces a closure which calls `debug_err!`.
They are both used extensively in the `wire` module during packet parsing.
As of this writing, `debug_err!` is implemented as follows:
```rust
/// Emit a debug message and return an error.
///
/// Invoke the `debug!` macro on all but the first argument. A call to
/// `debug_err!(err, ...)` is an expression whose value is the expression `err`.
macro_rules! debug_err {
    ($err:expr, $($arg:tt)*) => (
        {
            use ::log::debug;
            debug!($($arg)*);
            $err
        }
    )
}
```
I speculate that the unconditional use of the `debug!` macro may cause
performance problems: even when debug logging is disabled, each invocation
checks at runtime whether it is enabled. One way to alleviate this issue would
be some kind of conditional compilation. E.g., we could add a compile-time
flag to enable or disable the `debug!` invocation entirely (see the sketch
below). We could also use the `log` crate's `release_max_level_xxx` features
to disable debug logging crate-wide.
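
For example, a conditionally-compiled variant might look like the following
sketch (the `debug-err-logging` Cargo feature name is made up for
illustration):

```rust
/// Emit a debug message (only if the hypothetical `debug-err-logging` feature
/// is enabled) and return an error.
macro_rules! debug_err {
    ($err:expr, $($arg:tt)*) => (
        {
            // Compiled out entirely unless the feature is enabled, so
            // disabled builds pay no runtime cost, not even an is-enabled
            // check. Note that this also means side effects in the format
            // arguments do not run when the feature is off.
            #[cfg(feature = "debug-err-logging")]
            {
                use ::log::debug;
                debug!($($arg)*);
            }
            $err
        }
    )
}
```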

### Pervasive use of `HashMap`s

*Contact: joshlf@google.com*

In order to play nicely with Rust's ownership model, we generally pass `HashMap`
keys around where implementations in other languages might pass pointers. This
allows most functions to take a mutable reference to the entire state, but also
requires that we do fairly frequent hash map lookups even in the hot path.
In certain cases, it may make sense to replace hash maps with vectors (e.g., see
"Devices in `HashMap`s" section below). However, vectors require either linear
scanning or keys which can be used to index into the vector. Neither of these is
viable in the general case due to either large sets of data (too large to
linearly scan) or large, unstructured keys (unsuitable for indexing directly
into a vector).

#### Possible solutions

- Use `Rc<RefCell<T>>` for hash map entries. This allows us to cache
reference-counted pointers which point directly to the hash map entries. It
requires a dynamic borrow check in order to access the `RefCell`, although
this is undoubtedly much faster than a hash map lookup. The big disadvantage
is that it requires heap allocating the hash map entries rather than storing
them inline, which could significantly worsen both cache locality and cache
utilization. (See the first sketch after this list.)
- Use `BTreeMap`s. `BTreeMap`s are not quite as general as `HashMap`s since
they require their keys to be ordered, but in practice, most of our keys are.
For smaller maps, `BTreeMap`s are superior due to not having to compute a
hash. They are still pretty good for cache locality and utilization, although
not as good as `HashMap`s. When they get large, they lag behind `HashMap`s. We
would need to figure out where the cutoff point is, and how large we expect
our working sets to be for various map instances. DoS vulnerability may also
be a concern for `BTreeMap`s (a maliciously-chosen set of keys could induce
pathological behavior); `HashMap`s can protect against this with DoS-resistant
hash functions.
- Use more special-purpose data structures in particular cases. For example,
maps keyed by IPv4 addresses could be stored in a trie where each level is
keyed off of a byte of the IPv4 address, leading to a maximum trie depth of 4
(see the second sketch after this list).
- For any of these solutions, dynamically choose based on working set size or
other runtime characteristics. For example, we could provide a map type which
uses a `BTreeMap` below a certain size, and switches to a `HashMap` at larger
sizes (see the third sketch after this list).
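
A minimal sketch of the `Rc<RefCell<T>>` approach (the `TcpConn` type and the
key values are made up for illustration):

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical per-connection state.
struct TcpConn {
    seq: u32,
}

fn main() {
    let mut conns: HashMap<u64, Rc<RefCell<TcpConn>>> = HashMap::new();
    conns.insert(42, Rc::new(RefCell::new(TcpConn { seq: 0 })));

    // One-time lookup: cache a clone of the `Rc` (e.g., in a per-connection
    // context object)...
    let cached = Rc::clone(&conns[&42]);

    // ...later, on the hot path: no hashing, just `RefCell`'s dynamic borrow
    // check.
    cached.borrow_mut().seq += 1;
}
```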
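
A minimal sketch of the fixed-depth IPv4 trie (all names are hypothetical; the
256-way fan-out per octet trades memory for hash-free lookups):

```rust
/// Trie keyed by IPv4 address: one level per octet, so depth is at most 4.
struct Ipv4Trie<V> {
    root: Node<V>,
}

struct Node<V> {
    value: Option<V>,
    children: [Option<Box<Node<V>>>; 256],
}

impl<V> Node<V> {
    fn new() -> Self {
        Node { value: None, children: std::array::from_fn(|_| None) }
    }
}

impl<V> Ipv4Trie<V> {
    fn new() -> Self {
        Ipv4Trie { root: Node::new() }
    }

    /// Walk down one octet at a time, creating missing levels.
    fn insert(&mut self, addr: [u8; 4], value: V) {
        let mut node = &mut self.root;
        for octet in addr {
            node = &mut **node.children[octet as usize]
                .get_or_insert_with(|| Box::new(Node::new()));
        }
        node.value = Some(value);
    }

    /// Lookup is at most four array indexes and no hashing.
    fn get(&self, addr: [u8; 4]) -> Option<&V> {
        let mut node = &self.root;
        for octet in addr {
            node = node.children[octet as usize].as_deref()?;
        }
        node.value.as_ref()
    }
}
```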
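
And a minimal sketch of the size-adaptive map (the `PROMOTE_AT` cutoff is
invented; the real value would have to come from benchmarks):

```rust
use std::collections::{BTreeMap, HashMap};
use std::hash::Hash;

/// Illustrative cutoff; the real value would need benchmarking.
const PROMOTE_AT: usize = 128;

/// A `BTreeMap` for small working sets, promoted to a `HashMap` once the map
/// crosses the cutoff.
enum AdaptiveMap<K, V> {
    Small(BTreeMap<K, V>),
    Large(HashMap<K, V>),
}

impl<K: Ord + Hash + Eq, V> AdaptiveMap<K, V> {
    fn new() -> Self {
        AdaptiveMap::Small(BTreeMap::new())
    }

    fn get(&self, key: &K) -> Option<&V> {
        match self {
            AdaptiveMap::Small(m) => m.get(key),
            AdaptiveMap::Large(m) => m.get(key),
        }
    }

    fn insert(&mut self, key: K, value: V) -> Option<V> {
        // Promote to a hash map once the tree grows past the cutoff.
        if let AdaptiveMap::Small(m) = &mut *self {
            if m.len() >= PROMOTE_AT {
                let drained = std::mem::replace(m, BTreeMap::new());
                *self = AdaptiveMap::Large(drained.into_iter().collect());
            }
        }
        match self {
            AdaptiveMap::Small(m) => m.insert(key, value),
            AdaptiveMap::Large(m) => m.insert(key, value),
        }
    }
}
```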

### Devices in `HashMap`s

*Contact: joshlf@google.com*

@@ -41,7 +159,9 @@ to get sparse. However, it's likely that this would happen very slowly, as
devices are not removed and added with any frequency. Another problem would be
that, because all device IDs share the same namespace regardless of device type,
all of the `Vec`s would have to be long enough to accommodate every device ID,
even device IDs of other types. This could, in turn, be alleviated by allocating
device IDs in pools (e.g., using the top 32 bits to identify the type of device,
and the bottom 32 bits to identify the device itself).
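
For illustration, a pooled ID might be packed like this (the `DeviceId` layout
and the type tags are hypothetical):

```rust
/// 64-bit device ID: high 32 bits tag the device type, low 32 bits index into
/// that type's `Vec`.
#[derive(Copy, Clone, PartialEq, Eq, Debug)]
struct DeviceId(u64);

// Illustrative type tags.
const ETHERNET: u32 = 1;
const LOOPBACK: u32 = 2;

impl DeviceId {
    fn new(device_type: u32, index: u32) -> DeviceId {
        DeviceId(((device_type as u64) << 32) | (index as u64))
    }

    fn device_type(self) -> u32 {
        (self.0 >> 32) as u32
    }

    /// Index into the per-type `Vec`; no other type's IDs share this range.
    fn index(self) -> usize {
        self.0 as u32 as usize
    }
}
```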
Another concern is security: is it important that device IDs be unpredictable?
Are we concerned that sequential device IDs might leak information that we don't
@@ -53,4 +173,9 @@ enum of a `HashMap` and a `Vec`, and have a cutoff point at which, once the
Another option would be to retain randomly-generated IDs, and simply keep
devices in a `Vec` which is scanned linearly on lookup. For small numbers of
devices, this might still be faster than a `HashMap`.
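
A minimal sketch of that approach (the `Device` type is a stand-in):

```rust
/// Stand-in for per-device state.
struct Device {
    id: u64, // randomly-generated ID
    // ...device state...
}

/// Linear scan: for a handful of devices this can beat a `HashMap`, since it
/// avoids hashing and touches one contiguous, cache-friendly allocation.
fn lookup(devices: &[Device], id: u64) -> Option<&Device> {
    devices.iter().find(|d| d.id == id)
}
```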
*NOTE: It [seems likely](https://github.com/rust-lang/rust/pull/56241) that
Rust's `HashMap` will see significant speed improvements in the near future,
which may make the tradeoffs here lean somewhat more in favor of using
`HashMap`s*