
Should data be brought to the compute or vice versa?
This is an old question in the ‘storage centric’ workload realm. Old as it may be, I am not aware of any cost model that tries to answer it in terms of price, that is: what costs more, bringing the data to the compute node or placing more CPU power where the data resides? Be it my ignorance or the actual nonexistence of such a model, here is a simple pricing model that attempts to answer that question.
The Pricing Model
The pricing model is defined by three metrics, explained below.
- CPU time per GB, measured in Sec/GB
- CPU core price per second, measured in $/Sec
- Network pipe price per second, measured in $/Sec
Using the above three metrics, plus the network pipe speed, we can compare the price of moving a certain amount of data vs. running a computation over it. For the former, take the data size, divide it by the network speed, and multiply by the network pipe price per second. For the latter, multiply the data size by the CPU time per GB, and then by the CPU price per second.
CPU time per GB
The CPU time per GB of a given workload – CGW in short – is defined as the time in seconds it takes the workload W to process 1GB worth of data, when running on a single 100% utilized core. Note that CGW is measured in [Sec/GB]. To determine the actual CGW of a workload W, one should probably take an average over a mixture of representative input data of W and normalize by the average actual CPU utilization in case the core was not fully utilized. That is, if CGW is 10[Sec/GB] but the average utilization is 10%, then CGW is essentially 1[Sec/GB].
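The definition above, including the utilization normalization, can be sketched in a few lines of Python (the function and parameter names are mine, for illustration only):

```python
def cgw(wall_time_sec, data_size_gb, cpu_utilization=1.0):
    """CPU time per GB [Sec/GB] of a workload, normalized by the
    average core utilization observed during the run."""
    return (wall_time_sec / data_size_gb) * cpu_utilization

# The example from the text: 10 [Sec/GB] of wall time at 10% average
# utilization is effectively 1 [Sec/GB] of CPU time.
print(cgw(wall_time_sec=10.0, data_size_gb=1.0, cpu_utilization=0.10))  # 1.0
```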
CPU core price per second
This metric tells us the price of a CPU core per second. A simple way to get a number here is to take the price of a given CPU, divide it by the number of cores it has (counting hardware threads), and then by the number of seconds in 3 years. The exact number of years is not very important. To make the model more accurate we may want to throw in the power usage, but let us keep things simple for now. The unit is measured in [$/Sec]. We will use CS to denote this metric.
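A minimal sketch of the CS calculation; the $400 price tag and the 8 hardware threads are made-up numbers, not quotes for any specific CPU:

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def cpu_core_price_per_sec(cpu_price_usd, hardware_threads, years=3):
    """CS [$/Sec]: CPU price amortized per hardware thread per second."""
    return cpu_price_usd / hardware_threads / (years * SECONDS_PER_YEAR)

# Hypothetical example: a $400 CPU with 8 hardware threads over 3 years.
cs = cpu_core_price_per_sec(400, 8)
```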
Network pipe price per second
This metric tells us the cost per second of moving bits from one machine to another. Moving data between two machines requires at least two host ports and two switch ports. To get the price of a switch port per second, we take the price of a switch, divide it by its number of ports, and then by the number of seconds in 3 years. Similarly, we can get the price per second of a host port. We will use PS to denote this metric. Once again, the model can be made more accurate by taking OPEX into account.
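The per-port amortization, and the "two host ports plus two switch ports" accounting, can be sketched as follows; the $8000 switch and $500 dual-port NIC prices are assumptions for illustration, not the quotes used in the spreadsheet:

```python
SECONDS_PER_3_YEARS = 3 * 365 * 24 * 3600

def port_price_per_sec(device_price_usd, num_ports):
    """Price per port per second, amortized over 3 years."""
    return device_price_usd / num_ports / SECONDS_PER_3_YEARS

# A transfer crosses at least two switch ports and two host ports.
# Hypothetical prices: a $8000 48-port switch and a $500 dual-port NIC.
ps = 2 * port_price_per_sec(8000, 48) + 2 * port_price_per_sec(500, 2)
```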
Putting the model together
One more unit we will need is the network coefficient, or NC for short. This coefficient represents the network speed normalized to GB (gigabytes): given a data size in GB, it tells us how many seconds it takes to move it. For 10Gbps Ethernet, this would be 8/10 times the data size in GB. Note that the units of NC are [Sec/GB] (it takes 8/10 of a second to move 1GB worth of data over a 10Gbps link): we multiply by 8 because the network speed is given in bits (10Gbps) while we measure data in bytes, and we divide by 10, the link speed in Gbps.
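The 8/10 arithmetic above generalizes to any link speed; a one-liner makes the unit conversion explicit:

```python
def network_coefficient(link_speed_gbps):
    """NC [Sec/GB]: seconds to move 1 GB of data over the link.
    Multiply by 8 (bits per byte), divide by the link speed in Gbps."""
    return 8.0 / link_speed_gbps

print(network_coefficient(10))  # 0.8 Sec/GB for 10Gbps Ethernet
```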
Given all these units, we can now spell out the answer we are looking for. Given a workload and an input size:
- The CPU price to run the workload W over the data is: CGW [Sec/GB] × size [GB] × CS [$/Sec]
- The network price to move the workload’s data is: NC [Sec/GB] × size [GB] × PS [$/Sec]
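The two formulas above can be put side by side in a short sketch; the per-second prices plugged in at the bottom are made-up placeholders, not the spreadsheet's figures:

```python
def cpu_price(cgw, size_gb, cs):
    """Price [$] to run workload W over the data on a single core:
    CGW [Sec/GB] x size [GB] x CS [$/Sec]."""
    return cgw * size_gb * cs

def network_price(nc, size_gb, ps):
    """Price [$] to move the workload's data:
    NC [Sec/GB] x size [GB] x PS [$/Sec]."""
    return nc * size_gb * ps

# Illustrative comparison with made-up per-second prices.
size = 100.0  # GB
print(cpu_price(cgw=4.61, size_gb=size, cs=1e-9))
print(network_price(nc=0.8, size_gb=size, ps=5e-9))
```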
Some Simple Examples
To get a better feeling for the model, here are some examples. The actual numbers should be taken with a grain of salt, for reasons listed below; again, the idea is really to get a feel for the model. The workload in question is a simple grep. More precisely, we run ‘time grep FRA data.csv > /tmp/example’, where data.csv is a 130MB CSV file. 130MB is ~0.127GB. Thus, taking the (real) time reported by the above command and dividing it by 0.127 we get the CGW values below. Refer to this spreadsheet for all the arithmetic.
We have considered the following CPUs:
- Intel Core i5 2.8 GHz – 4.61[Sec/GB]
- Intel E5-2680 v3 @ 2.50GHz – 1.69[Sec/GB]
- Broadcom ARM11 processor – 3 [Sec/GB]
Using Internet prices for the CPUs (at the time of writing), and taking into account the number of hardware threads in each, we get the $/Sec price of each CPU (see spreadsheet). I could not find a price for the ARM11 (from any manufacturer), so the quoted price in the sheet is for a whole Pi board that uses this CPU. For the Pipe/Sec price I used 10Gbps quotes for a Cisco WS-C4948-10GE-S Catalyst 4948-10GE 48-port switch and an Intel Ethernet Converged Network Adapter X540-T2 (PCI Express 2.1 x8, low profile). Putting all the numbers into the spreadsheet we get the following prices. The network price should be read as the price to move the bits, and the CPU prices as the prices to run this particular workload on each of the CPUs.
Clearly, besides the CPU there are many other differences between the tested systems (RAM type, the storage device where the data is kept, etc.). While some of these differences can be mitigated (e.g. by using a RAM disk), they will always be there. On the other hand, we are not out to compare different CPUs; we are comparing a given system with a given CPU to a given network setup.
Some random thoughts
- Cores and network scale differently: It is much easier to add network ports than cores to a compute node. Also, cores do not ‘scale linearly’.
- The way the network price is calculated is really a ‘lower bound’:
  - Typically, the number of network hops for the data to be read from the storage system is at least two: a client-facing layer and a storage layer.
  - The network time was calculated from the network speed and the data size alone, without accounting for, e.g., TCP overhead.
- The model can be easily extended to take into account power. Take the socket power consumption and divide by the number of hardware threads. Take the switch power consumption divided by the number of ports. Multiply the resulting numbers by the electricity price at your favorite location normalized to seconds.
- The model really compares compute and network prices. Clearly after moving the data using the network we would still need a CPU on the other side…
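The power extension mentioned above can be sketched the same way as the other metrics; the 120W socket, 8 hardware threads, and $0.12/kWh electricity price below are illustrative assumptions:

```python
SECONDS_PER_HOUR = 3600

def power_price_per_sec(watts, num_units, usd_per_kwh):
    """Amortized power cost [$/Sec] per hardware thread (for a socket)
    or per port (for a switch)."""
    kw_per_unit = (watts / num_units) / 1000.0
    return kw_per_unit * usd_per_kwh / SECONDS_PER_HOUR

# Hypothetical: a 120W socket with 8 hardware threads at $0.12/kWh,
# roughly 5e-7 $/Sec per thread.
print(power_price_per_sec(120, 8, 0.12))
```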