- Data is summarized into hourly histogram buckets, enabling efficient reporting over long time periods. (Eventually I'll add coarser buckets, to support very long time periods.)
- There are the beginnings of a reporting engine.
- It's easy to add microbenchmarks.
- {read, write} a randomly selected entry from an int[] of size {16KB, 16MB, 256MB}. One thing I hope to probe here is the lifetime of data in the processor cache. If performance varies over time, that may suggest cache pollution from other VMs sharing a physical machine with us. (On reflection, the parameters I'm using probably need to be tweaked. Microbenchmarking is of course tricky, and I'm not an expert. RAM benchmarks might form a topic for a later post.)
- {read, write} 4K bytes from an int[] of size 16KB.
- Invoke Math.sin() one million times. (This was the "CPU" test from the original prototype.)
- A simple multiply-and-add loop.
- Read one small entry from a SimpleDB database, with or without consistency. (Same as original prototype.)
- Write one small entry to a SimpleDB database. (Again, same as original prototype.)
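To make the memory tests above a little more concrete, here is a minimal sketch of what a random-access benchmark of this kind might look like. It is not the code that produced the numbers in this post; the class name, method names, and seeds are hypothetical, and the measured time includes the cost of generating random indices.

```java
import java.util.Random;

/** Hypothetical sketch of a random-access memory microbenchmark. */
final class RandomAccessBenchmark {
    /** Written once per run so the JIT cannot discard the read loop. */
    static volatile int blackhole;

    /** Reads randomly selected entries and returns the elapsed time in milliseconds. */
    static double timeReads(int[] buffer, int iterations, long seed) {
        Random random = new Random(seed);
        int sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += buffer[random.nextInt(buffer.length)];
        }
        long elapsedNanos = System.nanoTime() - start;
        blackhole = sink;
        return elapsedNanos / 1e6;
    }

    /** Writes to randomly selected entries and returns the elapsed time in milliseconds. */
    static double timeWrites(int[] buffer, int iterations, long seed) {
        Random random = new Random(seed);
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            buffer[random.nextInt(buffer.length)] = i;
        }
        return (System.nanoTime() - start) / 1e6;
    }

    public static void main(String[] args) {
        // Buffer sizes from the list above. The 256MB case may need a larger heap (e.g. -Xmx1g).
        int[] sizesInBytes = {16 * 1024, 16 * 1024 * 1024, 256 * 1024 * 1024};
        for (int bytes : sizesInBytes) {
            int[] buffer = new int[bytes / Integer.BYTES];
            System.out.printf("%d bytes: read %.1f ms, write %.1f ms%n",
                    bytes, timeReads(buffer, 1_000_000, 1L), timeWrites(buffer, 1_000_000, 2L));
        }
    }
}
```

A real harness would presumably also warm up the JIT and repeat each measurement many times; this sketch times a single pass only.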
Here's a snapshot as of this writing (all times in milliseconds):
| Operation | # samples | Min | 10th %ile | Median | Mean | 90th %ile | 99th %ile | 99.9th %ile | Max |
|---|---|---|---|---|---|---|---|---|---|
| Read 4 bytes from a 16KB buffer (10,000,000 times) | 605 | 0 | 21.9 | 54.9 | 56 | 76.1 | 126.9 | 667.5 | 707 |
| Write 4 bytes to a 16KB buffer (10,000,000 times) | 605 | 0 | 20 | 53 | 51.7 | 69.2 | 115.5 | 414.5 | 423.3 |
| Read 4K bytes from a 16KB buffer (100,000 times) | 605 | 0 | 479 | 594.5 | 823.6 | 1589.9 | 1985.5 | 2095 | 2032.8 |
| Write 4K bytes to a 16KB buffer (100,000 times) | 605 | 0 | 260 | 317.3 | 449.6 | 843.5 | 1877.3 | 2003.1 | 2005.6 |
| Read 4 bytes from a 16MB buffer (1,000,000 times) | 605 | 0 | 25.9 | 64.7 | 62.5 | 84.6 | 169 | 311.4 | 311.7 |
| Write 4 bytes to a 16MB buffer (1,000,000 times) | 605 | 0 | 59.1 | 123.3 | 119.5 | 152.3 | 398.2 | 1182.6 | 1245.8 |
| Read 4 bytes from a 256MB buffer (1,000,000 times) | 605 | 0 | 32.3 | 79.3 | 80.5 | 99.2 | 271.3 | 1075.1 | 1051.8 |
| Write 4 bytes to a 256MB buffer (1,000,000 times) | 605 | 0 | 93 | 135.9 | 149.9 | 164.8 | 774.2 | 1574 | 1562.6 |
| 1,000,000 repetitions of Math.sin | 605 | 0 | 341 | 395.8 | 597.3 | 1143.8 | 1954 | 2011.2 | 1987.9 |
| 10,000,000 repetitions of integer multiplication | 605 | 0 | 11 | 22.8 | 26.4 | 44.5 | 78.6 | 376.8 | 365.6 |
| Read (inconsistent) | 605 | 0 | 21.6 | 36.2 | 43 | 71.9 | 112.8 | 376.8 | 367.8 |
| Read (consistent) | 605 | 0 | 22.9 | 37.5 | 45.1 | 73.5 | 134.5 | 606.9 | 622.4 |
| Write | 605 | 0 | 51.6 | 75 | 94.4 | 124 | 662.6 | 1430.9 | 1451.6 |
[Update: removed broken links from the table above. If you click on the dashboard link above, you'll see a table similar to this one, but with histogram links included.]
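For readers curious how percentile columns like these can be derived from hourly histogram buckets, here is a rough sketch. It is an illustration only, not the actual implementation: the class name, the fixed bucket width, and the choice to report the upper edge of the matching bucket are all assumptions of mine.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: latency samples summarized into one histogram per hour. */
final class HourlyHistogram {
    private static final long MILLIS_PER_HOUR = 3_600_000L;
    private final double bucketWidthMs;
    private final int bucketCount;
    // Maps hour index (epoch millis / one hour) to per-bucket sample counts.
    private final Map<Long, long[]> countsByHour = new HashMap<>();

    HourlyHistogram(double bucketWidthMs, int bucketCount) {
        this.bucketWidthMs = bucketWidthMs;
        this.bucketCount = bucketCount;
    }

    /** Adds one sample; latencies beyond the last bucket are clamped into it. */
    void record(long timestampMillis, double latencyMs) {
        long hour = Math.floorDiv(timestampMillis, MILLIS_PER_HOUR);
        long[] counts = countsByHour.computeIfAbsent(hour, h -> new long[bucketCount]);
        int index = (int) Math.min(bucketCount - 1, Math.max(0, (long) (latencyMs / bucketWidthMs)));
        counts[index]++;
    }

    /**
     * Estimates a percentile (fraction in (0, 1]) over hours fromHour..toHour inclusive.
     * Returns the upper edge of the bucket holding that sample, or NaN if there is no data.
     */
    double percentile(long fromHour, long toHour, double fraction) {
        long[] merged = new long[bucketCount];
        long total = 0;
        for (Map.Entry<Long, long[]> entry : countsByHour.entrySet()) {
            if (entry.getKey() >= fromHour && entry.getKey() <= toHour) {
                for (int i = 0; i < bucketCount; i++) {
                    merged[i] += entry.getValue()[i];
                    total += entry.getValue()[i];
                }
            }
        }
        if (total == 0) {
            return Double.NaN;
        }
        long target = Math.max(1, (long) Math.ceil(fraction * total));
        long seen = 0;
        for (int i = 0; i < bucketCount; i++) {
            seen += merged[i];
            if (seen >= target) {
                return (i + 1) * bucketWidthMs;
            }
        }
        return bucketCount * bucketWidthMs;
    }
}
```

Because only bucket counts are stored, a report over many hours just sums the per-hour arrays, which is what keeps long-range reporting cheap. The trade-off is that percentiles are only accurate to within one bucket width.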
I'll wait for more data, and a better reporting tool (in particular, the ability to graph changes over time), before discussing these results. I plan to add the following microbenchmarks in the near future:
- Disk performance: {read, write} {small, large} blocks of data at random offsets in a file. A very large file tests cache-miss performance; a smaller file could test cross-VM cache pollution.
- Network: ping another EC2 instance with {small, large} requests.
- Simple tests of Amazon's RDS (hosted MySQL) service, similar to the SimpleDB tests.
- AppEngine tests -- as many of the AWS tests as are applicable. (Local-disk tests are not applicable under AppEngine. A form of network test is possible, but it would not be directly comparable to the EC2 test, as I don't believe AppEngine supports socket-level network access.)
- Tests for AppEngine's memcache service.
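As a preview of the planned disk test, here is one way the "blocks of data at random offsets in a file" benchmark might be sketched. None of this exists yet; the class and method names are hypothetical. Note that on a small or recently written file the operating system's page cache will likely serve most reads, so this measures cache-hit behavior unless the file is much larger than available memory.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Random;

/** Hypothetical sketch of a random-offset disk microbenchmark. */
final class DiskBlockBenchmark {
    /** Reads blocks at random offsets and returns the elapsed time in milliseconds. */
    static double timeRandomReads(String path, int blockSize, int iterations, long seed)
            throws IOException {
        return run(path, "r", blockSize, iterations, seed);
    }

    /** Overwrites blocks at random offsets (never growing the file); returns elapsed milliseconds. */
    static double timeRandomWrites(String path, int blockSize, int iterations, long seed)
            throws IOException {
        return run(path, "rw", blockSize, iterations, seed);
    }

    private static double run(String path, String mode, int blockSize, int iterations, long seed)
            throws IOException {
        byte[] block = new byte[blockSize];
        Random random = new Random(seed);
        try (RandomAccessFile file = new RandomAccessFile(path, mode)) {
            long maxOffset = file.length() - blockSize;
            if (maxOffset < 0) {
                throw new IllegalArgumentException("file is smaller than one block");
            }
            boolean writing = mode.equals("rw");
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                // Offset in [0, maxOffset]; the slight modulo bias is ignored in this sketch.
                long offset = Math.floorMod(random.nextLong(), maxOffset + 1);
                file.seek(offset);
                if (writing) {
                    file.write(block);
                } else {
                    file.readFully(block);
                }
            }
            return (System.nanoTime() - start) / 1e6;
        }
    }
}
```

Calling `timeRandomReads(path, 4096, 1000, 1L)` against a small and a very large file would cover the {small, large} and cache-hit versus cache-miss combinations described above. Writes in this sketch are not forced to disk, so they mostly measure buffered-write cost.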