Home Rumble Youtube Twitter/X Kofi Contact / Crypto

Hash Table

HT is a typesafe, open addressing, linear probing hash table that uses MurmurHash for its hashing function. Memory allocations are automatically managed internally and any insert operation may cause a grow (full re-hash) at any time. By default, the keys are NULL-terminated C strings with externally owned memory. The hash table does not make internal copies of the keys, and any pointer passed into HT_insert() as a key must remain valid for any operation on the hash table's contents other than freeing them (HT_destroy).

Alternate key types can be specified as an optional first argument in the declaration. These keys are copied internally and space for a key is allocated into every bucket whether it's in use or not.

This is one of my earlier data structures and the techniques used inside are naive and overly complicated, but I don't currently have time to rewrite it to be more clear and flexible.

A hash table is declared using the following overloaded macro, where Kis the key's type and V is the value's type. HT(V) varname; HT(K, V) varname; It must be initialized before use by calling HT_init(HT* o, size_t initial_bucket_count);. The inital bucket count can be anything you feel is reasonable; the hash table grows automatically when it exceeds 75% capacity.

In this documentation, HT* will be used to refer to a pointer to a HT of unspecified type, which is used by all of the functions. Usually this is the address of the actual variable, not a proper independent pointer to it.
Examples: HT(int) ages; HT_init(&ages, 50); for(int i = 0; i < num_people; i++) { HT_set(&ages, people[i].name, people[i].age); } Due to C's semantics involving anonymous structs, you have to typedef a HT of some certain type if you want to declare pointers to it. I am not aware of a workaround at this time. typedef HT(int) IntLookup; // a dynamic array of (string-keyed) hash tables containing integer values VEC(IntLookup) nicknames; There are magic loop macros for iterating over the contents. They work just like they appear to, including the use of break and continue, and are a single statement internally, allowing constructs like if(foo) HT_EACH(&bar, k, int, b) { printf("%d\n", b); } or nesting loops without braces on the outer ones.

HT_EACH(VEC* o, [K] iterator, T, [T] value) { ... } HT_EACHP(VEC* o, [K] iterator, T, [T*] value_ptr) { ... } iterator and value are both declared as local variables. The contents of the given value are copied into value before each iteration. A pointer to the actual internal memory for the value is copied into val_ptr, allowing direct modification. Due to being a very old design, you have to specift the type of value as the third argument to the macro. This will probably change in the future, but the macro will be variadic to support older code.

Iteration happens in no particular order, and the behavior of the loop is undefined if the hash table is modified in any way during iteration. These macros use HT_next() and HT_nextp() internally. HT_EACH(&ages, name, int, age) { printf("name: %s, age: %d \n", name, age); } HT_EACHP(&ages, name, int, age) { *age += 1; printf("Happy birthday, %s!\n", name); }

void HT_init(HT* o, size_t alloc_size)
O(n)
Initializes the hash table with alloc_size buckets, and allocates memory for them.
void HT_destroy(HT* o)
O(1)
Frees all internal memory for the hash table. Default string keys are owned externally and not freed by this function.
size_t HT_fill(HT* o)
O(1)
Returns the current number of items in the hash table.
size_t HT_alloc(HT* o)
O(1)
Returns the current total number of buckets in the hash table. This will always be at least 25% bigger than the fill.
int HT_get(HT* o, K key, V* val_ptr)
O(1)
Looks up the provided key and copies the value into the memory pointed to by val_ptr if it exists. If the key is not present, the memory pointed to by val_ptr is not changed. Returns 0 if the key exists, 1 if it does not.
int HT_getp(HT* o, K key, V** val_ptr_ptr)
O(1)
Looks up the provided key and copies a pointer to the values internal memory into the pointer pointed to by val_ptr_ptr if it exists. If the key is not present
int HT_set(HT* o, K key, V* val_ptr)
O(1)
Inserts the key and value into the table, overwriting any existing value without notice. Any memory pointed to by key is externally owned and is not copied into the hash table. Returns 0 if successful.
int HT_delete(HT* o, K key)
O(1)
Deletes the provided key from the table. Returns 0 if successful, even if the key did not exist.
int HT_next(HT* o, void** iterator, K* key_ptr, V* val_ptr)
O(1)
Facilitates manual iteration over the hash table. Fills key_ptr and val_ptr with the next key and value based on iterator, then increments opaque iterator. Set iterator to NULL to start. Returns 1 if there are more items remaining, and 0 when iteration is complete allowing this function to be placed directly into a while() loops condition. Iteration happens in no defined order
int HT_nextp(HT* o, void** iterator, K* key_ptr, V** val_ptr_ptr)
O(1)
Works like HT_next() except that val_ptr_ptr is filled with a pointer to the internal memory for the value, allowing modification.
// iteration. no order. results undefined if modified while iterating // returns 0 when there is none left // set iter to NULL to start