HT is a typesafe, open addressing, linear probing hash table that uses MurmurHash for its hashing function. Memory allocations are automatically
managed internally and any insert operation may cause a grow (full re-hash) at any time. By default, the keys are NULL-terminated C strings with
externally owned memory. The hash table does not make internal copies of the keys, and any pointer passed into HT_insert() as a key must
remain valid for any operation on the hash table's contents other than freeing them (HT_destroy).
Alternate key types can be specified as an optional first argument in the declaration. These keys are copied internally and space for a
key is allocated into every bucket whether it's in use or not.
This is one of my earlier data structures and the techniques used inside are naive and overly complicated, but I don't currently have time
to rewrite it to be more clear and flexible.
A hash table is declared using the following overloaded macro, where Kis the key's type and V is the value's type.
HT(V) varname;
HT(K, V) varname;
It must be initialized before use by calling HT_init(HT* o, size_t initial_bucket_count);.
The inital bucket count can be anything you feel is reasonable; the hash table grows automatically when it exceeds 75% capacity.
In this documentation, HT* will be used to refer to a pointer to a HT of unspecified type, which is used by all of the functions.
Usually this is the address of the actual variable, not a proper independent pointer to it.
Examples:
HT(int) ages;
HT_init(&ages, 50);
for(int i = 0; i < num_people; i++) {
HT_set(&ages, people[i].name, people[i].age);
}
Due to C's semantics involving anonymous structs, you have to typedef a HT of some certain type if you want to declare
pointers to it. I am not aware of a workaround at this time.
typedef HT(int) IntLookup;
// a dynamic array of (string-keyed) hash tables containing integer values
VEC(IntLookup) nicknames;
There are magic loop macros for iterating over the contents. They work just like they appear to, including the use of break and continue, and
are a single statement internally, allowing constructs like if(foo) HT_EACH(&bar, k, int, b) { printf("%d\n", b); } or nesting loops without
braces on the outer ones.
HT_EACH(VEC* o, [K] iterator, T, [T] value) { ... }
HT_EACHP(VEC* o, [K] iterator, T, [T*] value_ptr) { ... }
iterator and value are both declared as local variables. The contents of the given value are copied into value before each iteration.
A pointer to the actual internal memory for the value is copied into val_ptr, allowing direct modification. Due to being a very old design,
you have to specift the type of value as the third argument to the macro. This will probably change in the future, but the macro will be
variadic to support older code.
Iteration happens in no particular order, and the behavior of the loop is undefined if the hash table is modified in any way during iteration.
These macros use HT_next() and HT_nextp() internally.
HT_EACH(&ages, name, int, age) {
printf("name: %s, age: %d \n", name, age);
}
HT_EACHP(&ages, name, int, age) {
*age += 1;
printf("Happy birthday, %s!\n", name);
}
void HT_init(HT* o, size_t alloc_size)
Initializes the hash table with alloc_size buckets, and allocates memory for them.
Frees all internal memory for the hash table. Default string keys are owned externally and not freed by this function.
Returns the current number of items in the hash table.