Tag: hashtables

Implementing Understandable World Class Hash Tables in C++ – by Eduardo Madrid & Scott Bruce CppCon 2022

  • Lobby
  • Tag Archives: hashtables

https://cppcon.org/
---

A Success Story in Implementing Understandable World Class Hash Tables in Cpp - Eduardo Madrid and Scott Bruce - CppCon 2022
https://github.com/CppCon/CppCon2022

We present a success story about implementing one of the most important data structures, hash tables, with world class performance.

C++ allowed maintaining the Robin Hood hash table invariant (abbreviated RH) with the performance of parallel computation, even using only normal integer operations:

1. The execution costs of maintaining this invariant were minimized
2. The benefits are realized beyond other implementations
3. The actual code is easier to understand than alternatives in the performance S-tier

RH is a powerful invariant that allows implementations to unlock non-obvious and substantial benefits including cache locality.

Our solution for this challenge is to use parallel computation techniques that require only normal integer processor operations, and can be scaled up to "SIMD" operations (SIMD means "Single Instruction Multiple Data").

The parallel programming abstraction layer is composable, programmer understandable, and friendly to the optimizing compiler. We will prove these assertions with real source code, benchmarks and Compiler Explorer disassembly.

This will illustrate how specifics of RH and the scalable SIMD-based implementation gives not-seen-before improvements: using the bits devoted to "Probe Sequence Length" not just for cache locality but also reducing false matches exponentially.

These examples gives us confidence that fully realizing the powers of this invariant will outperform alternative schemes.
---

Eduardo Madrid

Tech Lead at Snap, Inc., helping performance on augmented reality of the Snapchat app. Author of open source libraries and components such as the Zoo type-erasure framework. Prior experience at Fintech, including automated trading. Several times presenter at CPPCon, C++ Now, C++ London

Scott Bruce

Scott has been writing software professionally for 20+ years.
He's been entranced by performance and efficiency at least 15 of those 20+ years.
He worked in distributed systems, Search, Adwords, Adsense and ML at Google for 13 years. He worked 4.5 years at Snapcht, on monetization systems, performance, and advanced mobile telemetry systems. He is currently an engineer at Rockset, working on distributed system performance and real time analytics. Presenter of a production software talk at Google, Snap, UCLA, USC, Caltech and others over the last ten plus years.
---

Videos Filmed & Edited by Bash Films: http://www.BashFilms.com
YouTube Channel Managed by Digital Medium Ltd https://events.digital-medium.co.uk

#cppcon #programming #hashtable

Filed under: UncategorizedTagged with: , , , ,

Scalable and Low Latency Lock-free Data Structures in C++ – by Alexander Krizhanovsky – CppCon 2022

  • Lobby
  • Tag Archives: hashtables

https://cppcon.org/
---

Scalable and Low Latency Lock-free Data Structures in C++ - Alexander Krizhanovsky - CppCon 2022
https://github.com/CppCon/CppCon2022

Imagine that your program uses many threads, which insert and lookup millions times per second in a large data structure like std::map or std::unordered_map. Typically, you have to switch to a lock-free data structure for this task. Lock-free approaches perfectly scale data structures for multi-core systems, but hash tables and trees need some reorganization as more and more items are inserted and these reorganizations are hard to make lock-free. But the problem isn't only in contention and we also need to efficiently work with memory to develop a high performance data structure for the modern hardware. Cache conscious data structures address the problem by efficient usage of CPU caches and reducing the number of accesses to the main memory.

This talk makes a quick survey over several lock-free (primarily variations of hash table) and cache conscious (mostly trees) data structures. We'll discuss the best use cases for the data structures and their limitations. Besides performance in average cases, we'll also focus on worst case scenarios, which may introduce high tail latency on large busy systems. Tail, or high percentile, latency is a severe problem since it may reach significant values and the small percent of problem cases can be not so small in absolute values, e.g. if you service 1M users, then only 0.1% of them experiencing high processing time is a problem.

The main part of the talk is a step by step C++ implementation of Hash Trie, a hybrid lock-free cache conscious data structure, which provides good access time in average and worst case. We'll do microbenchmarks of the data structure and compare it with other data structures.

You'll learn about:
* when standard containers and locking mechanisms aren't enough
* several advanced data structures: split ordered lists and other variations of lock-free hash tables, tries (partricia trees) and hybrid data structures
* x86-64 memory ordering and cache hierarchy, operating system preemption and how to employ all the knowledge to implement a very fast data structure
* gotchas of data structures benchmarking, such as keys distribution, latency vs throughput, worst cases and so on
* an open source lock-free cache conscius Hash Trie implementation
---

Alexander Krizhanovsky

Alexander is the CEO of Tempesta Technologies, Inc., and is the architect of Tempesta FW, a high performance open source Linux application delivery controller. Alexander is responsible for the design and performance of several products in the areas of network traffic processing and databases. He designed the core architecture of a Web application firewall, mentioned in the Gartner Magic Quadrant '15, and the MariaDB temporal data tables.

Alexander gave talks at Netdev, SCALE, Linux Conf Australia, MariaDB user conferences, All Things Open, FOSDEM, Percona Live, IBM CASCON, and many other conferences. Alexander is also the author of a very fast lock-free MPMC ring buffer queue, published by the Linux Journal in 2013.
__

Videos Streamed & Edited by Digital Medium: http://online.digital-medium.co.uk

#cppcon #programming #datastructures

Filed under: UncategorizedTagged with: , , , , ,