1. Cryptocurrencies Are Detrimental to Society

    March 27, 2020

    Introduction

    How would you explain the inner workings of bitcoin to a person in simple, understandable terms?

    twitterimage

    source

    This explanation seems perfect to me because it illustrates some seriously problematic aspects of cryptocurrencies in one simple sentence.

    It captures the unimaginable energy waste of mining cryptocurrencies. And it also captures the dark side of cryptocurrencies: facilitating crime.

    At this point cryptocurrency enthusiasts are rolling their eyes and sighing. This point has been made many times over. I know, I'm definitely not the first to criticise cryptocurrencies1 this way.

    I do have a simple challenge though:

    Can you please show me the benefit to society of cryptocurrencies?

    Please, don't come up with theoretical or future possibilities. Cryptocurrencies have existed for eleven years, so they should have something to show for it by now. After many hours of reading up on the topic, I have not been able to find any tangible benefits that would justify the effort and resources spent on them.

    The downsides of cryptocurrencies are an entirely different matter. They are very, very clear to me. But let's not go there directly. What fun would that be?

    The solution in search of a problem

    As I see it, cryptocurrencies are entangled in a desperate search for a problem to solve. They are the answer to a question nobody asked2.

    Many cryptocurrency advocates see the decentralised (distributed) nature of cryptocurrencies as a tangible benefit. Cryptocurrencies are most often not controlled by any government or single entity.

    Aside from whether this is true in practice3, they seem to imply that it's a bad thing that governments control their own currencies. Well, last time I checked governments control their currencies to keep them as stable as possible. Frankly, that actually sounds like exactly what I would want from a currency.

    Stability.

    You can call cryptocurrencies many things but 'stable' is definitely not one of them. Cryptocurrencies are highly volatile. In cryptocoin, a loaf of bread could suddenly cost twice as much as the day before.

    Although cryptocurrencies haven't seen any significant adoption as a payment method - due to their volatility - they have seen adoption in 'less stable countries' where they are basically the 'lesser of two evils'. I mean: you know things are bad if a volatile cryptocurrency is a safer option than the native currency of your country.

    The argument in the end boils down to: if your society is already in serious trouble, maybe cryptocurrencies could provide 'some benefit'. If the people involved have reliable access to internet. And internet access is most often controlled by the government.

    To my knowledge, cryptocurrency as a payment method has only seen true adoption within the world of dark markets such as Silk Road, in which bitcoin rose to prominence. After eleven years, cryptocurrencies still have no traction in the regular 'legitimate' markets as a real payment method.

    The reason why is obvious: existing payment methods are much easier to use and feel much safer. And the volatility of cryptocurrencies only strengthens the case for these conventional methods.

    If cryptocurrencies are a solution to anything at all they seem to be 'bad' solutions at best.

    Are cryptocurrencies in fact a Ponzi or Pyramid scheme?

    It depends on the particular currency, but I think the case can be made for sure. I mean: why not both?

    Cryptocurrencies have no intrinsic value. They are only worth what people are willing to pay for them. So the cryptocurrency advocates needed to drum up demand, to create a market where previously none existed. This resulted in wild visions of the future, elaborate jargon-filled smokescreens that argue cryptocurrencies would take over the world. Get in quickly or you miss out!

    And so many people were afraid to miss out, creating an enormous cryptocurrency hype, starting in November of 2017, spilling into 2018 when the bubble burst. The end result? A few people got very rich and the vast majority lost money.

    It seems to me this all is a combination of a Ponzi scheme with the component of active recruitment found in pyramid schemes. The value of the currencies must come from somewhere, right?

    So please tell me how all of this benefits our society? Creating a handful of rich people at the expense of a lot of other people? Is that it?

    The graveyard of dead cryptocurrencies only shows how many people or startups try to get a piece of the action. And so many of them are outright scams. Are in fact all cryptocurrencies scams at their core?

    The downsides of cryptocurrencies

    I hope I have established that cryptocurrencies provide no tangible benefits to society. But they do have a lot of downsides. I observe the following:

    Topic                            Remark
    Trafficking of illegal goods     Drugs, weapons, child pornography, and so on
    Trafficking of illegal services  Murder for hire
    Tax evasion                      -
    Money laundering                 -
    Ransomware                       Hold data hostage in encrypted form
    Pollution / energy waste         Crypto miners use a lot of electricity
    Lack of security                 Several cryptocurrency exchanges have been hacked
    Crypto scams                     New cryptocoins are created just to scam people

    The easy retort to this table is: "'normal' currencies like the dollar or euro also facilitate almost all of those illegal things", which is true but it misses the larger point.

    Those regular currencies provide tremendous value to our societies. Our societies are built upon them and they facilitate almost everything we do. The topics listed in the table are just possible negative side-effects of regular currencies. Their clear benefits outweigh such downsides.

    Cryptocurrencies don't seem to have any such upsides. They seem to be made to exclusively facilitate cryptocurrency speculation and crime.

    Cryptocurrencies facilitate crime

    I won't discuss all the topics in the previous table but I do want to highlight a few.

    Dark markets

    I think we all remember Silk Road, a now defunct darknet marketplace that allowed people to anonymously buy - amongst other things - drugs and guns4. Silk Road was the first large-scale application of bitcoin as a means of payment.

    Silk Road started out with just drugs, but guns soon followed. This deeply depraved world of dark markets is very much enabled by cryptocurrencies because the parties involved in a transaction are so hard to identify.

    Ransomware

    Ransomware seems to be almost exclusively enabled by digital currencies. Not explicitly just cryptocurrencies, but they do enable this type of crime because tracing the payments back to the criminal is so difficult.

    The damage caused by ransomware is so obviously devastating. You can have many opinions on the fact that many critical organisations such as hospitals or universities don't have their computer security under control.

    The real problem is that cryptocurrencies make these kinds of attacks on businesses and institutions very low-risk and highly profitable. In my own country a university was targeted by such an attack and allegedly they paid the ransom. The disruption to its services was substantial.

    Energy waste

    As cryptocurrencies rose in value, it became profitable to 'mine' them. Mining started out on regular computers, but soon video cards, which are very power hungry, were used to accelerate it.

    Later on, FPGAs and ASICs were built to further accelerate mining performance. Entire companies sprang up to build those miners and host them in large datacenters with cheap electricity. The scale of the operation is enormous.

    According to an article dating to July 2019, just bitcoin mining consumes more electricity than Switzerland. The article links to an online tool that tracks this power usage in real-time based on some estimates.

    mf

    It's just mind boggling to me that so much energy is wasted, so much pressure is put on the environment, for absolutely no clear benefit at all.

    Closing words

    So in short, I think that cryptocurrencies provide nothing of value to society. They do however facilitate crime and contribute to climate change.

    Therefore, I would propose to shut them all down.

    The complex technology behind the cryptocurrencies attracted a lot of otherwise smart people and I think it's a sad thing to see their efforts going to waste or have a negative impact.

    People are not obliged to work on something valuable, but at least may I ask that they choose to work on something that won't harm our society?

    Link to hackernews, where this post was quickly flagged down. The few comments that exist don't seem to really provide any answers to the question I pose.


    1. I would definitely recommend reading this long-form-article by The New York Times.  

    2. Unless you want to embark on a path of criminal activity. 

    3. As mining is no longer profitable for the larger community, the miners become a small concentrated group of entities controlling the currency, making the currencies more centralised. Furthermore, the cryptocurrency exchanges where you can convert the cryptocurrency into regular money, are centralised institutions backed by for-profit companies. And those companies have to abide by the law. They are under the influence of the government. 

    4. If you want to know more about what happened to Silk Road and its founder - 'the Dread Pirate Roberts', I would recommend the book 'American Kingpin' by Nick Bilton. (no affiliate links) 

    Tagged as : None
  2. Understanding Storage Performance - IOPS and Latency

    March 21, 2020

    Introduction

    The goal of this blogpost is to help you better understand storage performance. I want to discuss some fundamentals that are true regardless of your particular needs.

    This will help you better reason about storage and may provide a scaffolding for further learning.

    If you run your applications / workloads entirely in the cloud, this information may feel antiquated or irrelevant.

    However, since the cloud is just somebody else's compute and storage, knowledge about storage may still be relevant. Cloud providers expose storage performance metrics for you to monitor and this may help to make sense of them.

    Concepts

    I/O

    An I/O is a single read/write request. That I/O is issued to a storage medium (like a hard drive or solid state drive).

    It can be a request to read a particular file from disk. Or it can be a request to write some data to an existing file. Reading or writing a file can result in multiple I/O requests.

    I/O Request Size

    The I/O request has a size. The request can be small (like 1 Kilobyte) or large (several megabytes). Different application workloads will issue I/O operations with different request sizes. The I/O request size can impact latency and IOPS figures (two metrics we will discuss shortly).

    IOPS

    IOPS stands for I/O Operations Per Second. It is a performance metric that is used (and abused) a lot in the world of storage. It tells us how many I/O requests per second can be handled by the storage (for a particular workload).

    Warning: this metric is meaningless without a latency figure. We will discuss latency shortly.

    Bandwidth or throughput

    If you multiply the IOPS figure with the (average) I/O request size, you get the bandwidth or throughput. We state storage bandwidth mostly in Megabytes and Gigabytes per second.

    To give you an example: if we issue a workload of 1000 IOPS with a request size of 4 Kilobytes, we will get a throughput of 1000 x 4 KB = 4000 KB per second. This is about 4 Megabytes per second.
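    The same multiplication can be written as a small helper. This is only a sketch: the function name is made up for illustration and it assumes decimal units (1 MB = 1000 KB), as in the example above.

```python
def throughput_mb_per_s(iops: float, request_size_kb: float) -> float:
    """Throughput in MB/s for a given IOPS figure and request size in KB.

    Assumes decimal units: 1 MB = 1000 KB.
    """
    return iops * request_size_kb / 1000


# The example from the text: 1000 IOPS with 4 KB requests.
print(throughput_mb_per_s(1000, 4))    # 4.0
# The same IOPS figure with 128 KB requests moves far more data.
print(throughput_mb_per_s(1000, 128))  # 128.0
```

    It also shows why an IOPS figure should always be read together with the request size: the same 1000 IOPS can mean 4 MB/s or 128 MB/s.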

    Latency

    Latency is the time it takes for the I/O request to be completed. We start our measurement from the moment the request is issued to the storage layer and stop measuring when either we get the requested data, or get confirmation that the data is stored on disk.

    Latency is the single most important metric to focus on when it comes to storage performance, under most circumstances.

    For hard drives, an average latency somewhere between 10 to 20 ms is considered acceptable (20 ms is the upper limit).

    For solid state drives, depending on the workload, latency should never be higher than 1-3 ms. In most cases, workloads will experience a latency of less than 1 ms.

    IOPS and Latency

    This is a very important concept to understand. The IOPS metric is meaningless without a statement about latency. You must understand how long each I/O operation will take because latency dictates the responsiveness of individual I/O operations.

    If a storage solution can reach 10,000 IOPS but only at an average latency of 50 ms that could result in very bad application performance. If we want to hit an upper latency target of 10 ms the storage solution may only be capable of 2,000 IOPS.

    For more details on this topic I would recommend this blog and this blog.

    Access Patterns

    Sequential access

    An example of a sequential data transfer is copying a large file from one hard drive to another. A large number of sequential (often adjacent) datablocks is read from the source drive and written to another drive. Backup jobs also cause sequential access patterns.

    In practice this access pattern shows the highest possible throughput.

    Hard drives have it easy as they don't have to spend much time moving their read/write heads and can spend most time reading / writing the actual data.

    Random access

    I/O requests are issued in a seemingly random pattern to the storage media. The data could be stored all over various regions on the storage media. An example of such an access pattern is a heavily utilised database server or a virtualisation host running a lot of virtual machines (all operating simultaneously).

    Hard drives will have to spend a lot of time moving their read/write heads and can only spend little time transferring data. Both throughput and IOPS will plummet (as compared to a sequential access pattern).

    In practice, most common workloads, such as running databases or virtual machines, cause random access patterns on the storage system.

    Queue depth

    The queue depth is a number between 1 and ~128 that shows how many I/O requests are queued (in-flight) on average. Having a queue is beneficial as the requests in the queue can be submitted to the storage subsystem in an optimised manner and often in parallel. A queue improves performance at the cost of latency.

    If you have some kind of storage performance monitoring solution in place, a high queue depth could be an indication that the storage subsystem cannot handle the workload. You may also observe higher than normal latency figures. As long as latency figures are still within tolerable limits, there may be no problem.
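    One way to reason about how queue depth, IOPS and latency hang together is Little's law: on average, the number of requests in flight equals the request rate multiplied by the time each request takes. The sketch below is my own illustration and not taken from any benchmark tool; the function names are hypothetical and it assumes a steady workload with a stable average latency.

```python
def queue_depth_from_iops(iops: float, latency_ms: float) -> float:
    """Average number of in-flight requests implied by an IOPS and latency pair."""
    return iops * latency_ms / 1000


def iops_from_queue_depth(queue_depth: float, latency_ms: float) -> float:
    """IOPS that a given queue depth can sustain at a given average latency."""
    return queue_depth * 1000 / latency_ms


# 10,000 IOPS at 50 ms only works with a very deep queue.
print(queue_depth_from_iops(10_000, 50))  # 500.0
# 2,000 IOPS at 10 ms needs far fewer requests in flight.
print(queue_depth_from_iops(2_000, 10))   # 20.0
# A single outstanding request at 10 ms can never exceed 100 IOPS.
print(iops_from_queue_depth(1, 10))       # 100.0
```

    Seen this way, a high IOPS figure paired with a high latency figure simply means that a lot of requests are waiting in the queue.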

    Storage Media Performance characteristics

    Hard drives

    Hard drives (HDDs) are mechanical devices that resemble a record player. They have an arm with a read/write head and the data is stored on (multiple) platters. hd01

    Hard drives have to physically move read/write heads to fulfil read/write requests. This mechanical nature makes them relatively slow as compared to solid state drives (which we will cover shortly).

    Especially random access workloads cause hard drives to spend a lot of time on moving the read/write head to the right position at the right time, so less time is available for actual data transfers.

    The most important thing to know about hard drives is that from a performance perspective (focussing on latency) higher spindle speeds reduce the average latency.

    Rotational Speed (RPM)  Access Latency (ms)  IOPS
    5400                    17-18                50-60
    7200                    12-13                75-85
    10,000                  7-8                  120-130
    15,000                  5-6                  150-180

    Because the latency of individual I/O requests is lower on drives with a higher RPM, you can issue more of such requests in the same amount of time. That's why the IOPS figure also increases.
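    As a rough cross-check of the table: if a drive handles one request at a time, the number of requests per second is simply one second divided by the time per request. The sketch below assumes a queue depth of 1 and uses the latency ranges from the table; the function name is made up for illustration.

```python
def iops_at_queue_depth_one(latency_ms: float) -> float:
    """Requests per second when each request takes latency_ms and only one is in flight."""
    return 1000 / latency_ms


# Access latency ranges (ms) per rotational speed, copied from the table above.
latency_ranges_ms = {
    5400: (17, 18),
    7200: (12, 13),
    10_000: (7, 8),
    15_000: (5, 6),
}

for rpm, (fastest, slowest) in latency_ranges_ms.items():
    low = iops_at_queue_depth_one(slowest)
    high = iops_at_queue_depth_one(fastest)
    print(f"{rpm} RPM: roughly {low:.0f}-{high:.0f} IOPS")
```

    The calculated ranges land close to the IOPS column, although not exactly on it, which is to be expected since real drives are measured rather than calculated.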

    Latency and IOPS of an older Western Digital Velociraptor 10,000 RPM drive:

    wd01 Notice the latency and IOPS in the Queue Depth = 1 column.

    Source used to validate my own research.

    Regarding sequential throughput we can state that fairly old hard drives can sustain throughputs of 100-150 megabytes per second. More modern hard drives with higher capacities can often sustain between 200 - 270 megabytes per second.

    An important note: sequential transfer speeds are not constant and depend on the physical location of the data on the hard drive platters. As drives fill up, throughput diminishes. Throughput can drop more than fifty percent! 1.

    Solid State Drives

    Solid state drives (SSDs) have no moving parts, they are based on flash memory (chips). SSDs can handle I/O much faster and thus show significantly lower latency.

    ssd001

    Whereas we measure the average I/O latency of HDDs in milliseconds (a thousandth of a second) we measure the latency of SSD I/O operations in microseconds (a millionth of a second).

    Because of this reduced latency per I/O request, SSDs outperform HDDs in every conceivable way. Even a cheap consumer SSD can sustain 5000+ IOPS with only a 0.15 millisecond (150 microsecond) latency. That latency is about 40x better than the best latency of an enterprise 15K RPM hard drive.

    Solid state drives can often handle I/O requests in parallel. This means that larger queue depths with more I/O requests in flight can show significantly higher IOPS with a limited (but not insignificant) increase in latency.

    ssd01 The performance of an older SATA consumer SSD

    More modern enterprise SSDs show better latency and IOPS. The SATA interface seems to be the main bottleneck.

    ssd02 The performance of an enterprise SATA SSD

    SSDs perform better than HDDs across all relevant metrics except price in relation to capacity.

    Important note: SSDs are not well-suited for archival storage of data. Data is stored as charges in the chips and those charges can diminish over time. It's expected that even hard drives are better suited for offline archival purposes although the most suitable storage method would probably be tape.

    SSD actual performance vs advertised performance

    Many SSDs are advertised with performance figures of 80,000 - 100,000 IOPS at some decent latency. Depending on the workload, you may only observe a fraction of that performance.

    Most of those high 80K-100K IOPS figures are obtained by benchmarking with very high queue depths (16-32). The SSD benefits from such queue depths because it can handle a lot of those I/O requests in parallel.

    Please beware: if your workload doesn't fit in that pattern, you may see lower performance numbers.

    If we take a look at the chart above of the Intel SSD we may notice how the IOPS figures only start to come close to the advertised 80K+ IOPS as the queue depth increases. It's therefore important to understand the characteristics of your own workload.

    RAID

    If we group several hard drives together we can create a RAID array. A RAID array is a virtual storage device that exceeds the capacity and performance of a single hard drive. This allows storage to scale within the limits of a single computer.

    RAID is also used (or some say primarily used) to assure availability by assuring redundancy (drive failure won't cause data loss). But for this article we focus on its performance characteristics.

    SSDs can achieve impressive sequential throughput speeds, of multiple gigabytes per second. Individual hard drives can never come close to those speeds, but if you put a lot of them together in a RAID array, you can come very close. For instance, my own NAS can achieve such speeds using 24 drives.

    RAID also improves the performance of random access patterns. The hard drives in a RAID array work in tandem to service those I/O requests so a RAID array shows significantly higher IOPS than a single drive. More drives means more IOPS.

    The picture below shows the read IOPS performance of an 8-drive RAID 5 array of 1 TB, 7200 RPM drives. We run a benchmark of random 4K read requests.

    Notice how the IOPS increases as the queue depth increases.

    raidiops

    However, nothing is free in this world. A higher queue depth - which acts as a buffer - does increase latency.

    raidlat

    It makes sense to put SSDs in RAID. Although they are more reliable than hard drives, they can fail. If you care about availability, RAID is inevitable. Furthermore, you can observe the same benefits as with hard drives: you pool resources together, achieving higher IOPS figures and more capacity than possible with a single SSD.

    Capacity vs. Performance

    The following is mostly focussed on hard drives although it could be true for solid state drives as well.

    We put hard drives in RAID arrays to get more IOPS than a single drive can provide. At some point - as the workload increases - we may hit the maximum number of IOPS the RAID array can sustain with an acceptable latency.

    This IOPS/Latency threshold could be reached even if we have only 50% of the storage capacity of our RAID array in use. If we use the RAID array to host virtual machines for instance, we cannot add more virtual machines because this would cause the latency to rise to unacceptable levels.

    It may feel like a lot of good storage space is going to waste, and in some sense this may be true. For this reason, it could be a wise strategy to buy smaller 10,000 RPM or 15,000 RPM drives purely for the IOPS they can provide and forgo capacity.

    So you may have to order and add, let's say, 10 more hard drives to meet the IOPS/Latency demands while there's still plenty of space left.
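    A back-of-the-envelope way to size an array for IOPS rather than capacity is sketched below. It assumes IOPS scale linearly with the number of drives and ignores RAID write penalties, caching and queue depth, so treat the outcome as a first estimate only. The function name, the workload target and the per-drive figures are made up for illustration (the per-drive figures sit within the ranges of the hard drive table earlier in this article).

```python
import math


def drives_needed(target_iops: float, iops_per_drive: float) -> int:
    """Smallest number of drives whose combined IOPS meet the target.

    Assumes perfectly linear scaling, which real arrays only approximate.
    """
    return math.ceil(target_iops / iops_per_drive)


target = 2000  # hypothetical workload, in IOPS

# 7200 RPM drives at roughly 80 IOPS each.
print(drives_needed(target, 80))   # 25
# 15,000 RPM drives at roughly 165 IOPS each.
print(drives_needed(target, 165))  # 13
```

    With these numbers the faster drives meet the same IOPS target with about half the number of spindles, regardless of how much capacity each drive offers.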

    This kind of situation is less likely as SSDs have taken over the role of the performance storage layer and (larger capacity) hard drives are pushed in the role of 'online' archival storage.

    Closing words

    I hope this article has given you a better understanding of storage performance. Although it is just an introduction, it may help you to better understand the challenges of storage performance.


    1. https://en.wikipedia.org/wiki/Hard_disk_drive_performance_characteristics#Data_transfer_rate 

    Tagged as : storage
  3. Difference of Behavior in SATA Solid State Drives

    January 29, 2020

    Introduction


    Update: I've noticed some strange behavior of SSDs when benchmarking them with FIO. After further investigation and additional testing, I've found the reason for the strange patterns in the graphs.

    The 'strange' test results are due to the fact that they were obtained by connecting the SSDs to a P420i controller. As the HBA mode of this controller performs worse than the RAID mode, I used the RAID mode of this controller. Individual drives were put in a RAID0 volume. But it turns out that this creates a strange interaction between RAID controller and SSD.

    Additional testing with a SATA 300 AHCI controller shows 'normal' patterns that look similar to the results of the Intel SSD as compared to the other ones (Samsung and Kingston).

    It seems I've made a mistake by using the P420i controller for testing. I have included both 'bad' and 'good' results.


    Regular SATA solid state drives may seem interchangeable at this point. They all show amazing IOPS and latency performance.

    I have performed benchmarks on different SSDs from different vendors and it seems that they actually show very different behaviour. This behavior has come to light because I benchmarked the entire device capacity.

    The benchmark - performed with FIO - puts a fifty percent read/write random 4K workload on the device. The benchmark stops when all sectors of the device have been read or written to. Furthermore, all tests are performed with a queue depth of 1.

    I've made this post because I found the results interesting. At least the images show a very peculiar pattern for some SSDs. I can't explain them really, maybe you can.

    This is the test I ran against the SSDs.

    fio --filename=/dev/sdX --direct=1 --rw=randrw --refill_buffers \
        --norandommap --ioengine=libaio --bs=4k --rwmixread=50 --iodepth=1
    

    Disclaimer

    I've performed these benchmarks to the best of my knowledge. The raw benchmark data is available here.

    It's always possible that I made a mistake, so it may be wise to run your own tests to see if you can replicate these results.

    Caveat: I really don't know if these benchmark results impact real-life performance. Maybe these benchmark results show a kind of behaviour of SSDs that doesn't really matter in the end.

    Benchmark Results

    Intel D3-S4610

    This SSD is meant for datacenter usage. This is the test result on the P420i controller.

    intel

    IOPS and Latency are consistent during the whole benchmark. Its behaviour seems predictable.

    This is the test result on the AHCI controller:

    intelahci

    Samsung 860 Pro

    This SSD is meant for desktop usage. Its behavior seems quite different from the Intel SSD. I have separated the IOPS data from the Latency data to make the graphs more legible.

    This is the test result on the P420i controller.

    IOPS

    samsung860iops

    Latency

    samsung860latency

    The best-case latency is almost four times better than the worst-case latency. Latency is thus less predictable. This impact also seems to be reflected in the IOPS numbers.

    This is the test result on the AHCI controller:

    samsung860ahci

    Samsung PM883

    This SSD is meant for datacenter usage. This is the test result on the P420i controller.

    IOPS

    samsungpm883iops

    Latency

    samsungpm883latency

    This SSD seems to behave in a similar way as the 860 PRO.

    This is the test result on the AHCI controller:

    samsungpm8832ahci

    Kingston DC500M

    This SSD is meant for datacenter usage. This is the test result on the P420i controller.

    IOPS

    kingstondc500miops

    Latency

    kingstondc500mlatency

    The behavior of this SSD seems similar to the behaviour of the Samsung SSDs but the pattern is distinct: it seems shifted as compared to the Samsung SSDs.

    This is the test result on the AHCI controller:

    kingstondc500mahci

    Evaluation


    Updated evaluation

    We can conclude that the P420i RAID controller causes strange behavior not observed when we test the SSDs on a regular AHCI controller. Although this was an older SATA 300 controller, I'm making the assumption that this controller still has enough bandwidth to support a random 4K test as most tests never went beyond 50+ MB/s of throughput.
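    The bandwidth assumption can be checked with some simple arithmetic. The sketch below assumes roughly 300 MB/s of usable bandwidth for SATA 300 (3 Gbit/s line rate with 8b/10b encoding) and takes the 50 MB/s observation from the paragraph above. The constant and function name are made up for illustration.

```python
# Usable SATA 300 bandwidth: 3 Gbit/s line rate, 8b/10b encoding (assumed figure).
SATA_300_USABLE_MB_PER_S = 300


def link_utilisation(observed_mb_per_s: float,
                     link_mb_per_s: float = SATA_300_USABLE_MB_PER_S) -> float:
    """Fraction of the link bandwidth used by the observed throughput."""
    return observed_mb_per_s / link_mb_per_s


# About 50 MB/s was the highest throughput seen in most tests.
print(f"{link_utilisation(50):.0%} of the SATA 300 link is in use")
```

    At roughly one sixth of the link bandwidth, the interface itself is unlikely to be the limiting factor for this random 4K test.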


    At this point, I can only say that I observe quite different behavior between the Intel SSD and the other SSDs from Samsung and Kingston. The problem is that I can't tell if this affects real-life day-to-day application performance.

    It seems that although results for the Samsung and Kingston SSDs fluctuate quite a bit, it's quite possible that the fluctuations occur during a very short timespan and effectively cancel each other out.

    If you have comments, ideas or suggestions, leave a comment below.

    How are these images generated?

    All images have been generated with fio-plot.

    The github repository also contains a folder with a lot of example images.

    Tagged as : storage
