每个程序员都应该知道的数值

最近看到了一篇文章,主要介绍了Jeff Dean在2009年LADIS(Large-Scale Distributed Systems and Middleware)会议上的一次talk中所提到的一组每个程序员都应了解的数值。

下面的数值在Jeff Dean的基础上有所改进:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
Latency Comparison Numbers
--------------------------
L1 cache reference 0.5 ns
Branch mispredict 5 ns
L2 cache reference 7 ns 14x L1 cache
Mutex lock/unlock 25 ns
Main memory reference 100 ns 20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy 3,000 ns 3 us
Send 1K bytes over 1 Gbps network 10,000 ns 10 us
Read 4K randomly from SSD* 150,000 ns 150 us ~1GB/sec SSD
Read 1 MB sequentially from memory 250,000 ns 250 us
Round trip within same datacenter 500,000 ns 500 us
Read 1 MB sequentially from SSD* 1,000,000 ns 1,000 us 1 ms ~1GB/sec SSD, 4X memory
Disk seek 10,000,000 ns 10,000 us 10 ms 20x datacenter roundtrip
Read 1 MB sequentially from disk 20,000,000 ns 20,000 us 20 ms 80x memory, 20X SSD
Send packet CA->Netherlands->CA 150,000,000 ns 150,000 us 150 ms
Notes
-----
1 ns = 10^-9 seconds
1 us = 10^-6 seconds = 1,000 ns
1 ms = 10^-3 seconds = 1,000 us = 1,000,000 ns
Credit
------
By Jeff Dean: http://research.google.com/people/jeff/
Originally by Peter Norvig: http://norvig.com/21-days.html#answers
Contributions
-------------
Some updates from: https://gist.github.com/2843375
'Humanized' comparison: https://gist.github.com/2843375
Visual comparison chart: http://i.imgur.com/k0t1e.png
Animated presentation: http://prezi.com/pdkvgys-r0y6/latency-numbers-for-programmers-web-development/latency.txt

关于性能和延迟的基本概念,可以参看Randal E. Bryant的Computer Systems A Programmer’s Perspective。

为了演示缺页带来的开销,写了下面两个程序,这两个程序的功能都是将一个二维数组中所有元素置0,但第一个程序有更少的缺页,第二个会有更多的缺页(cache miss相同)。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
// This code makes cache and memory management sub-system happy
#include <stdio.h>
#include <stdlib.h>
#define ROW (1 << 10) // 1K bytes
#define COL (1 << 19) // 1M
int main(void)
{
unsigned long (*arr)[ROW] = NULL;
arr = malloc(sizeof(unsigned long) * ROW * COL);
for (int i = 0; i < ROW; ++i)
for (int j = 0; j < COL; ++j)
arr[i][j] = 0;
return 0;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
// This code makes cache and memory management sub-system unhappy
#include <stdio.h>
#include <stdlib.h>
#define ROW (1 << 10) // 1K bytes
#define COL (1 << 19) // 1M
int main(void)
{
unsigned long (*arr)[ROW] = NULL;
arr = malloc(sizeof(unsigned long) * ROW * COL);
for (int i = 0; i < COL; ++i)
for (int j = 0; j < ROW; ++j)
arr[i][j] = 0;
return 0;
}

第一个程序运行时间:2.459 total

第二个程序运行时间:8.655 total

如果我们假设CPU周期为一秒的话,那么更加直观的等价时长见下图:


¶ The end

Share Link: http://d0u9.win/posts/3650089173.html