This was first published on https://blog.dbi-services.com/amazon-aws-instances-and-oracle-database-performance (2017-02-01)
Republishing here for new followers. The content is related to the the versions available at the publication date
When you run Oracle Database on Amazon AWS you Bring Your Own Licenses depending on the number of virtual cores (which are the number of cores allocated to your vCPUs). Behind the instance types, you have different processors and hyper-threading. Then, when choosing which instance type to run, you want to know which processor offers the best performance for your Oracle Workload. Here is an example comparing the logical reads on T2, M4, R4 and C4 instances. My comparison is done running cached SLOB (https://kevinclosson.net/slob/) to measure the maximum number of logical reads per seconds when running the same workload on the different instance types. I’ve compared what you can have with 2 Oracle Database processor licences, which covers 2 cores (no core factor on AWS) which means 2 vCPU for T2 which is not hyper-threaded, and 4 vCPU for the others.
I was on Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
With one session:
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 1.0 13.1 0.00 5.37 DB CPU(s): 1.0 13.0 0.00 5.34 Logical read (blocks): 747,004.5 9,760,555.7With 2 sessions:
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 2.0 27.3 0.00 11.12 DB CPU(s): 2.0 27.1 0.00 11.04 Logical read (blocks): 1,398,124.7 19,111,284.0
T2 is not hyper-threaded which is why we double the LIOPS with two sessions. So with 2 Oracle licences on T2 we get 1.4 LIO/s
M4 is the latest General Purpose instance in EC2. It is hyper-threaded so with 2 Oracle processor licences we can use 4 vCPU. Here I was on Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz, 2 cores with 2 threads each.
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 1.0 13.1 0.00 5.46 DB CPU(s): 1.0 13.1 0.00 5.46 Logical read (blocks): 874,326.7 11,420,189.2
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 2.0 27.3 0.00 9.24 DB CPU(s): 2.0 27.2 0.00 9.22 Logical read (blocks): 1,540,116.9 21,047,307.6
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 3.0 40.9 0.00 12.33 DB CPU(s): 3.0 40.8 0.00 12.30 Logical read (blocks): 1,645,128.2 22,469,983.6
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 4.0 54.6 0.00 14.46 DB CPU(s): 4.0 54.3 0.00 14.39 Logical read (blocks): 1,779,361.3 24,326,538.0
Those CPU are faster than the T2 ones. With a single session, we can do 17% more LIOPS. And running on all the 4 threads, we can reach 1.8 kLIOPS which is 27% more that T2 for same Oracle licences.
R4 is the memory-intensive instance. I was on Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz so I expect about the same performance as M4.
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 1.0 13.7 0.00 6.01 DB CPU(s): 1.0 13.7 0.00 6.01 Logical read (blocks): 864,113.9 11,798,650.6
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 2.0 27.3 0.00 9.38 DB CPU(s): 2.0 27.2 0.00 9.36 Logical read (blocks): 1,546,138.8 21,115,125.5
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 3.0 40.9 0.00 14.07 DB CPU(s): 3.0 40.9 0.00 14.05 Logical read (blocks): 1,686,595.4 23,033,987.3
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 4.0 54.6 0.00 15.00 DB CPU(s): 4.0 54.3 0.00 14.93 Logical read (blocks): 1,837,289.9 25,114,082.1
This one looks a little faster. It is the same CPU but cached SLOB does not test only CPU frequency but also memory access. R4 instances have DDR4 memory.
For my last test I choose the compute-optimized C4 with Intel(R) Xeon(R) CPU E5-2666 v3 @ 2.90GHz
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 1.0 13.7 0.00 6.83 DB CPU(s): 1.0 13.7 0.00 6.83 Logical read (blocks): 923,185.0 12,606,636.8
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 2.0 27.3 0.00 9.38 DB CPU(s): 2.0 27.2 0.00 9.36 Logical read (blocks): 1,632,424.3 22,296,021.5
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 3.0 39.2 0.00 13.64 DB CPU(s): 3.0 39.1 0.00 13.61 Logical read (blocks): 1,744,709.5 22,793,491.7
Load Profile Per Second Per Transaction Per Exec Per Call ~~~~~~~~~~~~~~~ --------------- --------------- --------- --------- DB Time(s): 4.0 54.6 0.00 15.79 DB CPU(s): 4.0 54.3 0.00 15.71 Logical read (blocks): 1,857,692.6 25,396,599.8
According to https://aws.amazon.com/ec2/instance-types/ C4 instances have the lowest price/compute performance in EC2. The frequency is 20% faster than R4, but we have similar IOPS. The CPU frequency is not the only parameter for database workload.
You should not compare only the EC2 instance cost (I’ve indicated the approximate cost for RHEL in Europe, but you can check pricing at https://aws.amazon.com/ec2/pricing/reserved-instances/pricing/). You should estimate the Oracle licences you need to run your workload. Creating an EC2 instance takes only few minutes. Installing Oracle from an ORACLE_HOME clone is also very fast and creating a database with SLOB create_database_kit is easy. Fully automated, you can run the same SLOB tests on an instance and get results after 2 hours. It is highly recommended to do that before choosing the instance type for your database. The number of cores will determine the Oracle licences to buy, which is an acquisition cost + a yearly maintenance fee. The goal is to run on the processors that gives the best performance for your workload.
Dear Franck,
this comparison helps for sure: a given shape (no matter which cloud provider you choose) is only a starting point for a discussion. I’ve personally seen so many people asking for clock-rate, memory size, … and many more details. All such figures are just simple indicators.
That is, why I strongly advise to bring the workload onto services and compare actual SLAs. An AWR/ASH is again a good indication (you don’t have to write anything yourself). But more importantly: can your app digest it when moving a service to the cloud. That leads to the second component in the cloud being really important: network – we all know it but have a subtle tendency to forget about it.
Franck, great info here. I’m curious what your SLOB and DB configuration was to execute the “cached” SLOB workload?
regards, Tony Sarno
Hi Tony, From default SLOB configuration, I’ve just set UPDATE_PCT=0 and SCALE=80M with larger SGA (sga_target=1900M ) Regards, Franck
Hi Frack – great test – most useful.
Could you post the SQL section from your AWRs? I am curious about QPS (Queries per Second) and execution plan.
Basically getting a reference for per core lookups.
Would you run a test on a max-size database CPUs – to capture the contention/scalability capabilities of Oracle. At high core-counts mutexes and latches should come into play.
Hi Christo, I didn’t find time to check before, but if you want I can send you the AWR reports for all those tests. Basically, it is SLOB, so you can find something like that:
which is executed 28529 times per second on the latest test (c4.xlarge)