cache – Hackaday https://hackaday.com Fresh hacks every day Tue, 09 Dec 2025 09:05:09 +0000 en-US hourly 1 https://wordpress.org/?v=6.9.4 156670177 Linux Fu: The SSD Super Cache https://hackaday.com/2025/12/08/linux-fu-the-ssd-super-cache/ https://hackaday.com/2025/12/08/linux-fu-the-ssd-super-cache/#comments Mon, 08 Dec 2025 18:00:12 +0000 https://hackaday.com/?p=881264 NVMe solid state disk drives have become inexpensive unless you want the very largest sizes. But how do you get the most out of one? There are two basic strategies: …read more]]>

NVMe solid state disk drives have become inexpensive unless you want the very largest sizes. But how do you get the most out of one? There are two basic strategies: you can use the drive as a fast drive for things you use a lot, or you can use it to cache a slower drive.

Each method has advantages and disadvantages. If you have an existing system, moving high-traffic directories over to SSD requires a bind mount or, at least, a symbolic link. If your main filesystem uses RAID, for example, then those files are no longer protected.

Caching sounds good, in theory, but there are at least two issues. You generally have to choose whether your cache “writes through”, which means that writes will be slow because you have to write to the cache and the underlying disk each time, or whether you will “write back”, allowing the cache to flush to disk occasionally. The problem is, if the system crashes or the cache fails between writes, you will lose data.

Compromise

For some time, I’ve adopted a hybrid approach. I have an LVM cache for most of my SSD that hides the terrible performance of my root drive’s RAID array. However, I have some selected high-traffic, low-importance files in specific SSD directories that I either bind-mount or symlink into the main directory tree. In addition, I have as much as I can in tmpfs, a RAM drive, so things like /tmp don’t hit the disks at all.

There are plenty of ways to get SSD caching on Linux, and I won’t explain any particular one. I’ve used several, but I’ve wound up on the LVM caching because it requires the least odd stuff and seems to work well enough.

This arrangement worked just fine and gives you the best of both worlds. Things like /var/log and /var/spool are super fast and don’t bog down the main disk. Yet the main disk is secure and much faster thanks to the cache setup. That’s been going on for a number of years until recently.

The Upgrade Issue

I recently decided to give up using KDE Neon on my main desktop computer and switch to OpenSUSE Tumbleweed, which is a story in itself. The hybrid caching scheme seemed to work, but in reality, it was subtly broken. The reason? SELinux.

Tumbleweed uses SELinux as a second level of access protection. On vanilla Linux, you have a user and a group. Files have permissions for a specific user, a specific group, and everyone else. Permission, in general, means if a given user or group member can read, write, or execute the file.

SELinux adds much more granularity to protection. You can create rules that, for example, allow certain processes to write to a directory but not read from it. This post, though, isn’t about SELinux fundamentals. If you want a detailed deep dive from Red Hat, check out the video below.

The Problem

The problem is that when you put files in SSD and then overlay them, they live in two different places. If you tell SELinux to “relabel” files — that is, put them back to their system-defined permissions, there is a chance it will see something like /SSD/var/log/syslog and not realize that this is really the same file as /var/log. Once you get the wrong label on a system file like that, bad, unpredictable things happen.

There is a way to set up an “equivalence rule” in SELinux, but there’s a catch. At first, I had the SSD mounted at /usr/local/FAST. So, for example, I would have /usr/local/FAST/var/log. When you try to equate /usr/local/FAST/var to /usr/var, you run into a problem. There is already a rule that /usr and /usr/local are the same. So you have difficulties getting it to understand that throws a wrench in the works.

There are probably several ways to solve this, but I took the easy way out: I remounted to /FAST. Then it was easy enough to create rules for /var/log to /FAST/var/log, and so on. To create an equivalence, you enter:


semanage fcontext -a -e /var/log /FAST/var/log

The Final Answer

So what did I wind up with? Here’s my current /etc/fstab:


UUID=6baad408-2979-2222-1010-9e65151e07be /              ext4    defaults,lazytime,commit=300 0 1
tmpfs                                     /tmp           tmpfs   mode=1777,nosuid,nodev 0 0
UUID=cec30235-3a3a-4705-885e-a699e9ed3064 /boot          ext4    defaults,lazytime,commit=300,inode_readahead_blks=64 0 2
UUID=ABE5-BDA4                            /boot/efi      vfat    defaults,lazytime 0 2
tmpfs                                       /var/tmp    tmpfs  rw,nosuid,nodev,noexec,mode=1777 0 0

<h1>NVMe fast tiers</h1>

UUID=c71ad166-c251-47dd-804a-05feb57e37f1 /FAST  ext4  defaults,noatime,lazytime  0  2
/FAST/var/log /var/log  none  bind,x-systemd.requires-mounts-for=/FAST 0 0
/FAST/usr/lib/sysimage/rpm /usr/lib/sysimage/rpm none bind,x-systemd.requires-mounts-for=/FAST 0 0
/FAST/var/spool /var/spool  none  bind,x-systemd.requires-mounts-for=/FAST 0 0

As for the SELinux rules:


/FAST/var/log = /var/log
/FAST/var/spool = /var/spool
/FAST/alw/.cache = /home/alw/.cache
/FAST/usr/lib/sysimage/rpm = /usr/lib/sysimage/rpm
/FAST/alw/.config = /home/alw/.config
/FAST/alw/.zen = /home/alw/.zen

Note that some of these don’t appear in /etc/fstab because they are symlinks.

A good rule of thumb is that if you ask SELinux to relabel the tree in the “real” location, it shouldn’t change anything (once everything is set up). If you see many changes, you probably have a problem:


restorecon -Rv /FAST/var/log

Worth It?

Was it worth it? I can certainly feel the difference in the system when I don’t have this setup, especially without the cache. The noisy drives quiet down nicely when most of the normal working set is wholly enclosed in the cache.

This setup has worked well for many years, and the only really big issue was the introduction of SELinux. Of course, for my purposes, I could probably just disable SELinux. But it does make sense to keep it on if you can manage it.

If you have recently switched on SELinux, it is useful to keep an eye on:


ausearch -m AVC -ts recent

That shows you if SELinux denied any access recently. Another useful command:


systemctl status setroubleshootd.service

Another good systemdstupid trick.” Often, any mysterious issues will show up in one of those two places. If you are on a single-user desktop, it isn’t a bad idea to retry any strange anomalies with SELinux turned off as a test: setenforce 0. If the problem goes away, it is a sure bet that something is wrong with the SELinux system.

Of course, every situation is different. If you don’t need RAID or a huge amount of storage, maybe just use an SSD as your root system and be done with it. That would certainly be easier. But, in typical Linux fashion, you can make of it whatever you want. We like that.

]]>
https://hackaday.com/2025/12/08/linux-fu-the-ssd-super-cache/feed/ 14 881264 LinuxFu
Dreamcast Homebrew Gets Boost from SD Card Cache https://hackaday.com/2021/06/04/dreamcast-homebrew-gets-boost-from-sd-card-cache/ https://hackaday.com/2021/06/04/dreamcast-homebrew-gets-boost-from-sd-card-cache/#comments Sat, 05 Jun 2021 05:00:34 +0000 https://hackaday.com/?p=480841 While it might have been a commercial failure compared to contemporary consoles, the Sega Dreamcast still enjoys an active homebrew scene more than twenty years after its release. Partly it’s …read more]]>

While it might have been a commercial failure compared to contemporary consoles, the Sega Dreamcast still enjoys an active homebrew scene more than twenty years after its release. Partly it’s due to the fact that you can burn playable Dreamcast discs on standard CD-Rs, but fans of the system will also point out that the machine was clearly ahead of its time in many respects, affording it a bit of extra goodwill in the community.

That same community happens to be buzzing right now with news that well-known Dreamcast hacker [Ian Micheal] has figured out how to cache data to an SD card via the console’s serial port. At roughly 600 KB/s the interface is too slow to use it as swap space for expanding the system’s paltry 16 MB of memory, but it’s more than fast enough to load game assets which otherwise would have had to be loaded into RAM.

A third-party Dreamcast SD adapter.

In the video below, [Ian] shows off his new technique with a port of DOOM running at 640×480. He’s already seeing an improvement to framerates, and thinks further optimizations should allow for a solid 30 FPS, but that’s not really the most exciting part. With the ability to load an essentially unlimited amount of data from the SD card while the game is running, this opens the possibility of running mods which wouldn’t have been possible otherwise. It should also allow for niceties like saving screenshots or game progress to the SD card for easy retrieval.

[Ian] says he’ll be bringing the same technique to his Dreamcast ports of Quake and Hexen in the near future, and plans on posting some code to GitHub that demonstrates reading and writing to FAT32 cards so other developers can get in on the fun. The downside is that you obviously need to have an SD card adapter plugged into your console to make use of this technique, which not everyone will have. Luckily they’re fairly cheap right now, but we wouldn’t be surprised if the prices start climbing. If you don’t have one already, now’s probably the time to get one.

To be clear, this technique is completely separate from replacing the Dreamcast’s optical drive with an SD card, which itself is a very popular modification that’s helped keep Sega’s last home console kicking far longer than anyone could have imagined.

[Thanks to cass for the tip.]

]]>
https://hackaday.com/2021/06/04/dreamcast-homebrew-gets-boost-from-sd-card-cache/feed/ 10 480841 dreamcast-ram-featured
Sushi Roll Helps Inspect Your CPU Internals https://hackaday.com/2019/08/21/sushi-roll-helps-inspect-your-cpu-internals/ https://hackaday.com/2019/08/21/sushi-roll-helps-inspect-your-cpu-internals/#comments Wed, 21 Aug 2019 08:00:00 +0000 https://hackaday.com/?p=372722 [Gamozolabs’] post about Sushi Roll — a research kernel for monitoring Intel CPU internals — is pretty long. While we were disappointed at the end that the kernel’s source is …read more]]>

[Gamozolabs’] post about Sushi Roll — a research kernel for monitoring Intel CPU internals — is pretty long. While we were disappointed at the end that the kernel’s source is not exactly available due to “sensitive features”, we were so impressed with the description of the modern x86 architecture and some of the work done with Sushi Roll, that we just had to post it. If the post gets you wanting to actually try some of this, you can check out another [Gamozolabs] creation, Orange Slice.

While you probably know that a modern Intel CPU bears little resemblance to the old 8086 processor it emulates, it is surprising, sometimes, to realize just how far it has gone. The very first thing the CPU does is to break your instruction up into microoperations. The execution engine uses some sophisticated techniques for register renaming and scheduling that allow you to run instructions out of order and to run more than one instruction per clock cycle.

The purpose of Sushi Roll is to reduce uncertainty in timing so that measurements can reveal short microoperation durations. The kernel does not use locking, nor does it use interrupts, timers, threads, or processes. This allows code to run without a lot of extraneous things affecting timing like cache evictions or interrupts. Combined with the Intel performance monitoring registers allows you to make some very specific measurements.

Like we said, we were sorry you can’t get the kernel source to do your own measurements. However, the work is impressive and the background information is still a good read, too.

A lot of this internal trivia seemed unimportant until it became the subject of security exploits. We just can’t get enough of CPU internals.

]]>
https://hackaday.com/2019/08/21/sushi-roll-helps-inspect-your-cpu-internals/feed/ 8 372722 cpu
Caching In on Program Performance https://hackaday.com/2019/06/23/caching-in-on-program-performance/ https://hackaday.com/2019/06/23/caching-in-on-program-performance/#respond Mon, 24 Jun 2019 02:00:00 +0000 https://hackaday.com/?p=363962 Most of us have a pretty simple model of how a computer works. The CPU fetches instructions and data from memory, executes them, and writes data back to memory. That …read more]]>

Most of us have a pretty simple model of how a computer works. The CPU fetches instructions and data from memory, executes them, and writes data back to memory. That model is a good enough abstraction for most of what we do, but it hasn’t really been true for a long time on anything but the simplest computers. A modern computer’s memory subsystem is much more complex and often is the key to unlocking real performance. [Pdziepak] has a great post about how to take practical advantage of modern caching to improve high-performance code.

If you go back to 1956, [Tom Kilburn’s] Atlas computer introduced virtual memory based on the work of a doctoral thesis by [Fritz-Rudolf Güntsch]. The idea is that a small amount of high-speed memory holds pieces of a larger memory device like a memory drum, tape, or disk. If a program accesses a piece of memory that is not in the high-speed memory, the system reads from the mass storage device, after possibly making room by writing some part of working memory back out to the mass storage device.

Caching takes this even further. The CPU executes code from a small but very fast cache. A larger and slower cache acts as mass storage for the fast cache. That cache may have its own cache until eventually one of the caches empties into a mass storage device. Naturally, there are some differences since the purpose is different: cache is mainly concerned with faster memory access while virtual memory tries to allow large programs to run in less physical memory.

However, this is a lot different than our common mental model. In a very real sense, today’s modern CPUs execute programs from mass storage. That’s why you can have many huge programs running on a single computer with limited memory. However, the CPU really executes from a very small high-speed memory.

A modern cache is often split into separate parts for instruction and data, and [Pdziepak] is looking specifically at the level 1 instruction cache. It gets pretty detailed, but it does talk about tools to examine cache performance and also about hot and cold functions, something we don’t think gets enough use.

Of course, if you are just writing normal code, you probably don’t care. But if you are trying to wring the most performance you can get out of your CPU, you’ll enjoy the post.

Unfortunately, the cache has had a bad security rep lately. Although Meltdown and Spectre got most of the press, there’s also Foreshadow.

]]>
https://hackaday.com/2019/06/23/caching-in-on-program-performance/feed/ 0 363962 mem
Spectre and Meltdown: How Cache Works https://hackaday.com/2018/01/15/spectre-and-meltdown-how-cache-works/ https://hackaday.com/2018/01/15/spectre-and-meltdown-how-cache-works/#comments Mon, 15 Jan 2018 18:01:28 +0000 http://hackaday.com/?p=290125 The year so far has been filled with news of Spectre and Meltdown. These exploits take advantage of features like speculative execution, and memory access timing. What they have in …read more]]>

The year so far has been filled with news of Spectre and Meltdown. These exploits take advantage of features like speculative execution, and memory access timing. What they have in common is the fact that all modern processors use cache to access memory faster. We’ve all heard of cache, but what exactly is it, and how does it allow our computers to run faster?

In the simplest terms, cache is a fast memory. Computers have two storage systems: primary storage (RAM) and secondary storage (Hard Disk, SSD). From the processor’s point of view, loading data or instructions from RAM is slow — the CPU has to wait and do nothing for 100 cycles or more while the data is loaded. Loading from disk is even slower; millions of cycles are wasted. Cache is a small amount of very fast memory which is used to hold commonly accessed data and instructions. This means the processor only has to wait for the cache to be loaded once. After that, the data is accessible with no waiting.

A common (though aging) analogy for cache uses books to represent data: If you needed a specific book to look up an important piece of information, you would first check the books on your desk (cache memory). If your book isn’t there, you’d then go to the books on your shelves (RAM). If that search turned up empty, you’d head over to the local library (Hard Drive) and check out the book. Once back home, you would keep the book on your desk for quick reference — not immediately return it to the library shelves. This is how cache reading works.

Intel Haswell diagram. Note how much real estate is used by the L3 cache and the Memory Controller.

Cache is Expensive Real Estate

Early computers ran so slowly that cache really didn’t matter. Core memory was plenty fast when CPU speed was measured in kHz. The first data cache was used in the IBM System/360 Model 85, released in 1968. IBM’s documentation claimed memory operations would take ¼ to ⅓ less time compared to a system without data cache.

Now the obvious question is if cache is faster, why not make all memory out of cache? There are two answers to that. First, cache is more expensive than main memory. Generally, cache is built with static ram, which is much more expensive than the dynamic ram used in main memory. The second answer is location, location, location. With processors running at several GHz, cache now needs to be on the same piece of silicon with the processor itself. Sending the signals out over PCB traces would take too long.

The CPU core generally doesn’t know or care about the cache. All the housekeeping and cache management is handled by the Memory Management Unit (MMU) or cache controller. These are complex logic systems that need to operate quickly to keep the CPU loaded with data and instructions.

Direct Mapped Cache

The simplest form of a cache is called direct mapped cache. Direct mapped means that every memory location maps to just one cache location. In the diagram, there are 4 cache slots.This means that Cache index 0 might hold memory index 0,4,8 and so on. Since one cache block can hold any one of multiple memory locations, the cache controller needs a way to know which memory is actually in the cache. To handle this, every cache block has a tag, which holds the upper bits of the memory address currently stored in the cache.

The cache controller also needs to know if the cache is actually holding usable data or garbage. When the processor first starts up, all the cache locations will be random. It would be really bad if some of this random data were accidentally used in a calculation. This is handled with a valid bit. If the valid bit is set to 0, then the cache location contains invalid data and is ignored. If the bit is 1, then the cache data is valid, and the cache controller will use it. The valid bit is forced to zero at processor power up and can be cleared whenever the cache controller needs to invalidate the cache.

Writing to Cache

So far I’ve covered read accesses using cache, but what about writes? The cache controller needs to be sure that any changes made to memory in the cache are also made in RAM before the cache gets overwritten by some new memory access. There are two basic ways to do this. First is to write memory every time the cache is written. This is called write-through cache. Write-through is safe but comes with a speed penalty. Reads are fast but writes happen at the slower speed of main memory. The more popular system is called write-back cache. In a write-back system, changes are stored in the cache until that cache location is about to be overwritten. The cache controller needs to know if this has happened, so one more bit called the dirty bit is added to the cache.

Cache data needs all this housekeeping data — the tag, the valid bit, the dirty bit — stored in high-speed cache memory, which increases the overall cost of the cache system.

Associative Cache

A direct mapped cache will speed up memory accesses, but the one-to-one mapping of memory to cache is relatively inefficient. It is much more efficient to allow data and instructions to be stored in more than one location.

A fully associative cache allows just that — any memory location can be stored in any cache location. This also means the cache must be searched on every memory access. A search function like this is done with a hardware comparator. A comparator like this would require lots of logic gates, and eat up a large physical area on the chip. The happy medium is a set associative cache. The diagram shows a 2 way set associative cache. This means any memory location can go in one of two cache locations. That keeps the hardware comparator relatively simple and fast. As an example, the Intel Core i7-8700K Level 3 cache is 16-way set associative.

Virtual Memory

In the beginning, computers ran one program at a time. A programmer had access to the entire memory of the system and had to manage how that memory was utilized. If his or her program was larger than the entire RAM space of the system, it would be up to the programmer to swap sections in or out as needed. Imagine having to check if printf() is loaded into memory every time you want to print something.

Virtual memory is the way an operating system works directly with the processor MMU to deal with this. Main Memory is broken up into pages. Each page is swapped in as it is needed. With cheap memory these days, we don’t have to worry as much about swapping out to disk. However, virtual memory is still vitally important for some of the other advantages it offers.

Virtual memory allows the operating system to run many programs at once — each program has its own virtual address space, which is then mapped to pages in physical memory. This mapping is stored in the page table, which itself lives in RAM. The page table is so important that it gets its own special cache called the Translation Look-aside Buffer, or TLB.

The MMU and virtual memory hardware also work with the operating system to enforce memory protection — don’t let programs read or modify each other’s memory space, and don’t let anyone mess with the kernel. This is the protection that is sidestepped by the Spectre and Meltdown attacks.

Into the real world

As you can see, even relatively simple cache systems can be hard to follow. In a modern processor like the Intel i7-8700K, there are multiple layers of caches, some independent, some shared between the CPU cores. There also are the speculative execution engines, which pull data that might be used from memory into cache before a given core is even ready for it. That engine is the key to Meltdown.

Both Spectre and Meltdown use cache in a timing based attack. Since cached memory is much faster to access, an attacker can measure access time to determine if memory is coming from RAM or the cache. That timing information can then be used to actually read out the data in the memory. This why a Javascript patch was pushed to browsers two weeks ago. That patch makes the built-in Javascript timing features a tiny bit less accurate, just enough to make them worthless in measuring memory access time which safeguards against browser-based exploits for these vulnerabilities.

The simple knee-jerk method of trying to mitigate Spectre and Meltdown is to disable the caches. This would make our computer systems incredibly slow compared to the speeds we are used to. The patches being rolled out are not nearly this extreme, but they do change the way the CPU works with the cache — especially upon user space to kernel space context switches.

Modern CPU datapath and cache design is an incredibly complex task. Processor manufacturers have amassed years of data by simulating and statistically analyzing how processors move data to make systems both fast and reliable. Changes to something like processor microcode are only made when absolutely necessary, and only after thousands of hours of testing. A shotgun change made in a rush (yes, six months is a rush) like the current Intel and AMD patches is sure to have some problems — and that is exactly what we’re seeing with big performance hits and crashes. The software side of the fix such as Kernel Page Table Isolation can force flushes of the TLBs, which also come with big performance hits for applications which make frequent calls to the operating system. This explains some of the reason why the impact of the changes is so application dependent. In essence, we’re all beta testers.

There is a bright side though. The hope is that with more time for research and testing on the part of the chip manufactures and software vendors, the necessary changes will be better understood, and better patches will be released.

That is, until the next attack vector is found.

]]>
https://hackaday.com/2018/01/15/spectre-and-meltdown-how-cache-works/feed/ 15 290125 Ivy-Bridge_Die_Label
Spectre and Meltdown: Attackers Always Have The Advantage https://hackaday.com/2018/01/10/spectre-and-meltdown-attackers-always-have-the-advantage/ https://hackaday.com/2018/01/10/spectre-and-meltdown-attackers-always-have-the-advantage/#comments Wed, 10 Jan 2018 18:00:46 +0000 http://hackaday.com/?p=289340 While the whole industry is scrambling on Spectre, Meltdown focused most of the spotlight on Intel and there is no shortage of outrage in Internet comments. Like many great discoveries, …read more]]>

While the whole industry is scrambling on Spectre, Meltdown focused most of the spotlight on Intel and there is no shortage of outrage in Internet comments. Like many great discoveries, this one is obvious with the power of hindsight. So much so that the spectrum of reactions have spanned an extreme range. From “It’s so obvious, Intel engineers must be idiots” to “It’s so obvious, Intel engineers must have known! They kept it from us in a conspiracy with the NSA!”

We won’t try to sway those who choose to believe in a conspiracy that’s simultaneously secret and obvious to everyone. However, as evidence of non-obviousness, some very smart people got remarkably close to the Meltdown effect last summer, without getting it all the way. [Trammel Hudson] did some digging and found a paper from the early 1990s (PDF) that warns of the dangers of fetching info into the cache that might cross priviledge boundaries, but it wasn’t weaponized until recently. In short, these are old vulnerabilities, but exploiting them was hard enough that it took twenty years to do it.

Building a new CPU is the work of a large team over several years. But they weren’t all working on the same thing for all that time. Any single feature would have been the work of a small team of engineers over a period of months. During development they fixed many problems we’ll never see. But at the end of the day, they are only human. They can be 99.9% perfect and that won’t be good enough, because once hardware is released into the world: it is open season on that 0.1% the team missed.

The odds are stacked in the attacker’s favor. The team on defense has a handful of people working a few months to protect against all known and yet-to-be discovered attacks. It is a tough match against the attackers coming afterwards: there are a lot more of them, they’re continually refining the state of the art, they have twenty years to work on a problem if they need to, and they only need to find a single flaw to win. In that light, exploits like Spectre and Meltdown will probably always be with us.

Let’s look at some factors that paved the way to Intel’s current embarrassing situation.

In Intel’s x86 lineage of processors, the Pentium Pro in 1995 was first to perform speculative execution. It was the high-end offering for demanding roles like multi-user servers, so it had to keep low-privilege users’ applications from running wild. But the design only accounted for direct methods of access. The general concept of side-channel attacks were well-established by that time in the analog world but it hadn’t yet been proven applicable to the digital world. For instance, one of the groundbreaking papers in side-channel attacks, pulling encryption keys out of certain cryptography algorithm implementations, was not published until a year after the Pentium Pro came to market.

Computer security was a very different and a far smaller field in the 1990s. For one, Internet Explorer 6, the subject of many hard lessons in security, was not released until 2001. The growth of our global interconnected network would expand opportunities and fuel a tremendous growth in security research on offense and defense, but that was still years away. And in the early 1990s, software security was in such a horrible state that only a few researchers were looking into hardware.

The Need for Speed

During this time when more people were looking harder at more things, Intel’s never-ending quest for speed inadvertently made the vulnerability easier to exploit. Historically CPU performance advancements have outpaced those for memory, and their growing disparity was a drag on overall system performance. CPU memory caches were designed to help climb this “memory wall”, termed in a 1994 ACM paper. One Pentium Pro performance boost came from moving its L2 cache from the motherboard to its chip package. Later processors added a third level of cache, and eventually Intel integrated everything into a single piece of silicon. Each of these advances made cache access faster, but that also increased the time difference between reading cached and uncached data. On modern processors, this difference stands out clearly against the background noise, illustrated in this paper on Meltdown.

Flash-forward to today: timing attacks against cache memory have become very popular. Last year all the stars aligned as multiple teams independently examined how to employ the techniques against speculative execution. The acknowledgements credited Jann Horn of Google Project Zero as the first to notify Intel of Meltdown in June 2017, triggering investigation into how to handle a problem whose seeds were planted over twenty years ago.

This episode will be remembered as a milestone in computer security. It is a painful lesson with repercussions that will continue reverberating for some time. We have every right to hold industry-dominant Intel to high standards and put them under spotlight. We expect mitigation and fixes. The fundamental mismatch of fast processors that use slow memory will persist, so CPU design will evolve in response to these findings, and the state of the art will move forward. Both in how to find problems and how to respond to them, because there are certainly more flaws awaiting discovery.

So we can’t stop you if you want to keep calling Intel engineers idiots. But we think that the moral of this story is that there will always be exploits like these because attack is much easier than defense. The Intel engineers probably made what they thought was a reasonable security-versus-speed tradeoff back in the day. And given the state of play in 1995 and the fact that it took twenty years and some very clever hacking to weaponize this design flaw, we’d say they were probably right. Of course, now that the cat is out of the bag, it’s going to take even more cleverness to fix it up.

]]>
https://hackaday.com/2018/01/10/spectre-and-meltdown-attackers-always-have-the-advantage/feed/ 45 289340 Meltdown Inside 16x9
PCI I-RAM Working Without a PCI Slot https://hackaday.com/2015/01/18/pci-i-ram-working-without-a-pci-slot/ https://hackaday.com/2015/01/18/pci-i-ram-working-without-a-pci-slot/#comments Sun, 18 Jan 2015 21:00:34 +0000 http://hackaday.com/?p=143999 iram[Gnif] had a recent hard drive failure in his home server. When rebuilding his RAID array, he decided to update to the ZFS file system. While researching ZFS, [Gnif] learned that the …read more]]> iram

[Gnif] had a recent hard drive failure in his home server. When rebuilding his RAID array, he decided to update to the ZFS file system. While researching ZFS, [Gnif] learned that the file system allows for a small USB cache disk to greatly improve his disk performance. Since USB is rather slow, [Gnif] had an idea to try to use an old i-RAM PCI card instead.

The problem was that he didn’t have any free PCI slots left in his home server. It didn’t take long for [Gnif] to realize that the PCI card was only using the PCI slot for power. All of the data transfer is actually done via a SATA cable. [Gnif] decided that he could likely get by without an actual PCI slot with just a bit of hacking.

[Gnif] desoldered a PCI socket from an old faulty motherboard, losing half of the pins in the process. Luckily, the pins he needed still remained. [Gnif] knew that DDR memory can be very power-hungry. This meant that he couldn’t only solder one wire for each of the 3v, 5v, 12v, and ground pins. He had to connect all of them in order to share the current load. All in all, this ended up being about 20 pins. He later tested the current draw and found it reached as high as 1.2 amps, confirming his earlier decision. Finally, the reset pin needed to be pulled to 3.3V in order to make the disk accessible.

All of the wires from his adapter were run to Molex connectors. This allows [Gnif] to power the device from a computer power supply. All of the connections were covered in hot glue to prevent them from wriggling lose.

]]>
https://hackaday.com/2015/01/18/pci-i-ram-working-without-a-pci-slot/feed/ 38 143999 iram