This course gives practicing C programmers an inside look at how C applications interact with libraries, operating systems, and hardware, with a focus on achieving optimal performance in both latency and concurrency. The course begins with a discussion of the nature of software performance and the balance between latency and concurrency. Techniques for accurate and objective performance measurement are then illustrated. Coverage of memory models follows, describing the performance considerations associated with registers, cache, and physical and virtual memory. Memory distribution architectures are examined, along with the C mechanisms available for implementing portable, memory-intensive application logic efficiently.