Introduction to Side Channel Attacks using CacheWarp
Cloud Commute · August 02, 2024
23
00:31:43 · 29.05 MB


In this week's episode we have a very special guest: Michael Schwarz from CISPA, a security researcher specializing in CPU side-channel attacks. He explains how side-channel attacks work in general, and most specifically through the example of his team's most recent find: #CacheWarp. He was also involved in the discovery of #Meltdown and #Spectre.

In this episode of Cloud Commute, Chris and Michael discuss:

  • Understanding side-channel attacks and their impact on cloud security
  • The role of CISPA in cybersecurity research and startup support
  • CacheWarp: A CPU vulnerability that enables privilege escalation
  • The future of hardware vulnerabilities and increasing complexity in CPUs

Interested in learning more about the cloud infrastructure stack, like storage, security, and Kubernetes? Head to our website (www.simplyblock.io/cloud-commute-podcast) for more episodes, and follow us on LinkedIn (www.linkedin.com/company/simplyblock-io). You can also check out the detailed show notes on YouTube (www.youtube.com/watch?v=fLUSrcah2xE).

You can find Michael Schwarz on X @misc0110 and LinkedIn: /michael-schwarz-5aaab720.

About simplyblock:

Simplyblock is an intelligent database storage orchestrator for IO-intensive workloads in Kubernetes, including databases and analytics solutions. It uses smart NVMe caching to reduce read I/O latency and speed up queries. A single system connects local NVMe disks, GP3 volumes, and S3, making it easier to handle storage capacity and performance. With the benefits of thin provisioning, storage tiering, and volume pooling, your database workloads get better performance at lower cost without changes to existing AWS infrastructure.

👉 Get started with simplyblock: https://www.simplyblock.io/buy-now

🏪 simplyblock AWS Marketplace: https://aws.amazon.com/marketplace/seller-profile?id=seller-fzdtuccq3edzm


00:00:00
But then we started

00:00:00
looking at this

00:00:01
interaction, like, okay,

00:00:02
we built this super,

00:00:03
really cool stuff.

00:00:05
We have that in

00:00:05
the CPUs now, and

00:00:06
it's pretty new.

00:00:07
And on the other side,

00:00:08
we have instructions

00:00:09
in the CPU that

00:00:10
were introduced

00:00:11
in the beginning

00:00:12
of x86 and nobody

00:00:13
uses them anymore.

00:00:14
CPUs become more

00:00:15
and more complex.

00:00:17
At that point in time,

00:00:17
it was the predictor

00:00:19
making guesses

00:00:20
where to jump next.

00:00:21
Look outside your

00:00:22
window and you see

00:00:23
if the lights are on.

00:00:24
And if the lights are

00:00:24
on, you have a certain

00:00:26
probability that the

00:00:27
neighbor is at home.

00:00:28
The lovely noisy

00:00:29
neighbor problem

00:00:30
that everyone knows

00:00:31
from the cloud.

00:00:32
I love the example

00:00:33
with the neighbor.

00:00:33
That's just incredibly

00:00:35
visual for everyone

00:00:37
to understand.

00:00:37
So if this noise that

00:00:39
you're seeing depends

00:00:40
on some secrets, then

00:00:42
it gets interesting.

00:00:43
And this is what we do.

00:00:46
You're listening to

00:00:47
Simplyblock's Cloud

00:00:47
Commute Podcast,

00:00:49
your weekly 20

00:00:49
minute podcast about

00:00:51
cloud technologies,

00:00:52
Kubernetes, security,

00:00:53
sustainability,

00:00:54
and more.

00:00:57
Hello, everyone.

00:00:58
Welcome back to this

00:00:59
week's episode of

00:01:00
Simplyblock's Cloud

00:01:01
Commute Podcast.

00:01:01
And as you know,

00:01:03
I have another

00:01:03
incredible guest.

00:01:04
I know I say this every

00:01:05
single time and it's

00:01:06
every single time true.

00:01:08
So, with me this week,

00:01:13
is Michael Schwarz,

00:01:16
fellow German.

00:01:18
Actually a security

00:01:19
researcher, with CISPA,

00:01:21
but he's gonna say a

00:01:23
few words about that.

00:01:24
So, welcome.

00:01:26
Welcome, Michael.

00:01:28
And maybe just introduce

00:01:29
yourself very quick.

00:01:31
Thanks for this

00:01:32
nice intro here.

00:01:34
Yes.

00:01:35
So as you already

00:01:36
said, I'm a

00:01:37
security researcher.

00:01:38
I'm working here

00:01:39
at CISPA, which

00:01:40
is a big research

00:01:42
center in Germany.

00:01:44
if you look at the

00:01:45
academic rankings,

00:01:47
one of the worldwide

00:01:48
leading institutes

00:01:49
for cybersecurity.

00:01:51
In my position

00:01:52
here, I'm leading

00:01:53
a research group.

00:01:54
I'm a faculty here.

00:01:55
So currently I have

00:01:57
six PhD students

00:01:59
that I advise and

00:02:01
additionally five

00:02:02
student helpers that

00:02:04
support me in the tasks.

00:02:05
We are a bit bigger,

00:02:09
not the biggest group,

00:02:10
but still a considerably

00:02:12
large group.

00:02:14
And we are working

00:02:16
on very specific

00:02:17
topics, I have to say.

00:02:18
It's also one of

00:02:19
the things I'm

00:02:19
talking about today.

00:02:21
We are working on kind

00:02:23
of all things related

00:02:25
to side channel attacks.

00:02:26
It's also a term that

00:02:28
we probably have to

00:02:29
introduce in the podcast

00:02:30
because it's, I guess,

00:02:32
not common knowledge.

00:02:34
Right.

00:02:34
Right.

00:02:35
We'll, get you

00:02:35
back in a second.

00:02:36
maybe you can extend

00:02:37
a little bit on CISPA.

00:02:39
CISPA is interesting

00:02:40
because Simplyblock

00:02:43
itself, well-

00:02:44
We're not part of CISPA,

00:02:46
but CISPA supports us

00:02:47
as well, due to the

00:02:50
encryption efforts we do

00:02:51
with the data storage.

00:02:52
But maybe say a

00:02:53
few more words

00:02:54
about CISPA itself.

00:02:56
Yes.

00:02:58
So CISPA is this

00:02:59
research center

00:03:01
here that's, that

00:03:02
has grown a lot.

00:03:03
So it's still

00:03:04
relatively new.

00:03:05
It was founded

00:03:06
six years ago.

00:03:08
And now we already

00:03:09
have 600, more than 600

00:03:11
people working here.

00:03:14
Among them, a lot

00:03:15
of scientists.

00:03:16
So we have now roughly

00:03:18
40 group leaders

00:03:20
here that have their

00:03:21
own research groups,

00:03:22
covering all topics

00:03:24
that are information

00:03:25
security related.

00:03:27
But also AI,

00:03:28
machine learning.

00:03:30
So we are also in this area.

00:03:33
We are mainly

00:03:35
scientists.

00:03:36
We are mainly working

00:03:38
on academic problems.

00:03:40
We try to

00:03:40
solve problems.

00:03:42
We write them down.

00:03:42
We publish papers

00:03:44
at the top academic

00:03:45
conferences worldwide.

00:03:47
And with that, we are

00:03:48
also leading worldwide.

00:03:50
So there's no other

00:03:51
university or research

00:03:53
center internationally

00:03:54
that has more of

00:03:55
these top publications

00:03:57
worldwide on this,

00:03:58
at these conferences.

00:04:01
So we really try to

00:04:02
be the best in all

00:04:03
the things we do.

00:04:05
Sounds like

00:04:06
marketing now.

00:04:07
But

00:04:09
this is really

00:04:10
what we want to do.

00:04:11
We want to be the best.

00:04:13
And of course,

00:04:13
also train the

00:04:14
next generation.

00:04:16
So that means

00:04:17
getting new students.

00:04:19
And also as you've seen

00:04:21
yourself, we want to

00:04:22
support companies here.

00:04:24
We, we're helping

00:04:25
startups with a

00:04:27
security background and

00:04:30
well, we want to make

00:04:31
sure that more people

00:04:33
work in this area.

00:04:35
But ideally also in

00:04:37
Germany that we are

00:04:38
creating more shops

00:04:40
here about these topics

00:04:43
because we also believe

00:04:44
that these are the

00:04:45
topics that will stay

00:04:46
relevant in the future.

00:04:48
I agree.

00:04:50
You already hinted at

00:04:51
side channel attacks.

00:04:53
I'm a bit of a geek.

00:04:54
So I love to look

00:04:55
into all of that.

00:04:56
I'm probably not,

00:04:59
well, I don't want

00:05:00
to say technically

00:05:01
skilled enough, but

00:05:01
I'm probably not deep

00:05:02
enough into side channel

00:05:04
attacks to specifically

00:05:05
explain how it works.

00:05:07
But I think you're,

00:05:08
doing a much better job.

00:05:10
Maybe just give us an

00:05:11
explanation, especially

00:05:12
the audience, what is

00:05:13
specifically a side

00:05:14
channel attack and how

00:05:15
kind of does that work?

00:05:17
Yes.

00:05:18
Maybe let's start

00:05:19
with an intuition

00:05:20
from the real world.

00:05:22
So sometimes when you

00:05:24
live somewhere, you have

00:05:25
a neighbor and you don't

00:05:26
know if this neighbor is

00:05:28
at home or not at home

00:05:29
or on vacation or not.

00:05:31
And you're also

00:05:31
not talking to

00:05:32
this neighbor.

00:05:32
You might not

00:05:35
even know the neighbor,

00:05:37
but you can still

00:05:38
learn something by

00:05:39
observing side effects

00:05:41
of the behavior

00:05:42
of your neighbor.

00:05:43
For example, look

00:05:44
outside your window

00:05:46
and you see if

00:05:46
the lights are on.

00:05:48
And if the lights are

00:05:49
on, you have a certain

00:05:51
probability that the

00:05:52
neighbor is at home.

00:05:54
I mean, you're not 100

00:05:55
percent sure, right?

00:05:56
sometimes you forget

00:05:57
to turn off the lights.

00:05:59
Then your

00:06:00
guess is wrong.

00:06:01
Sometimes you are at

00:06:02
home and don't have the

00:06:03
lights turned on because

00:06:05
maybe you're watching

00:06:05
TV or something.

00:06:07
But still you get a

00:06:08
good chance, if you

00:06:09
see lights, you can

00:06:10
assume that probably

00:06:12
this person is at home.

00:06:13
If the lights are

00:06:14
off, probably not.

00:06:15
If lights haven't

00:06:17
been on for a week,

00:06:18
probably the neighbor

00:06:19
is on vacation or died.

00:06:21
Hopefully not.

00:06:24
So you learn

00:06:25
certain things.

00:06:26
But just observing

00:06:27
things, not directly

00:06:28
about the neighbor, but

00:06:30
what is influenced by

00:06:31
the neighbor's behavior.

00:06:34
And in real worlds,

00:06:36
we have many such

00:06:37
scenarios where we

00:06:38
just see something and

00:06:40
then try to infer what

00:06:41
is really happening.

00:06:43
In computer science, we

00:06:44
try to do the same on

00:06:46
the software level and

00:06:47
on the hardware level.

00:06:49
So here, what we

00:06:50
do in our research,

00:06:51
we're not observing

00:06:52
neighbors directly,

00:06:54
but it's also not so

00:06:55
far away, if you're

00:06:56
talking about the cloud.

00:06:58
We also have neighbors

00:06:59
on the cloud.

00:07:00
We want to see what is

00:07:02
this neighbor on the

00:07:04
cloud, this other user

00:07:06
running on the same

00:07:07
server actually doing.

00:07:10
We can't directly

00:07:11
talk to them.

00:07:12
We can't see what

00:07:13
they're doing.

00:07:14
But we see certain

00:07:15
side effects of that.

00:07:17
So also intuitively

00:07:18
think about you run an

00:07:20
application that uses

00:07:21
a lot of resources,

00:07:23
that uses all your

00:07:24
CPU, all your memory.

00:07:26
Then the other

00:07:27
application maybe from a

00:07:28
different customer, sees

00:07:31
some bottlenecks there,

00:07:32
seeing a slowdown in

00:07:33
their own application.

00:07:35
And from that, you can

00:07:36
already infer not much.

00:07:38
So this is like really

00:07:40
simple one, contention

00:07:42
based side channel

00:07:43
is what it's called.

00:07:44
Someone uses resources,

00:07:46
resources are not

00:07:47
endless, so you cannot

00:07:49
use them, and then

00:07:50
you already see that

00:07:51
something is happening.
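The contention idea Michael describes, where the attacker only measures its own slowdown and infers that a neighbor is busy, can be sketched as a toy simulation. All capacity and load numbers below are invented for illustration; this is a model of the idea, not an attack.

```python
# Toy model of a contention-based side channel: the attacker measures only
# its own latency on a shared resource and infers whether a neighbor is
# active. Capacity and load numbers are invented for illustration.

CAPACITY = 10  # units of a shared resource, e.g. memory bandwidth

def my_latency(neighbor_load):
    # The busier the neighbor, the less capacity is left for us.
    available = max(CAPACITY - neighbor_load, 1)
    return CAPACITY / available

def neighbor_active(neighbor_load, idle_baseline=1.0):
    # The attacker compares its measured latency to an idle baseline.
    return my_latency(neighbor_load) > idle_baseline

print(neighbor_active(0))  # → False: no contention observed
print(neighbor_active(8))  # → True: someone is using the resource
```

The point is that the attacker never queries the neighbor directly; the signal is entirely in the attacker's own measurements.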

00:07:52
The lovely noisy

00:07:53
neighbor problem

00:07:54
that everyone knows

00:07:55
from the cloud.

00:07:56
Yes.

00:07:57
And that often has a

00:07:58
performance problem,

00:07:59
but it's actually

00:08:00
a security issue.

00:08:01
If you can then start,

00:08:03
try to infer not

00:08:04
only that there is

00:08:05
a neighbor, but what

00:08:07
the neighbor is doing.

00:08:08
So if this noise that

00:08:10
you're seeing depends

00:08:12
on some secrets, then

00:08:14
it gets interesting.

00:08:15
And this is what we do.

00:08:16
We try to find such

00:08:17
noise patterns that

00:08:19
are unique to secrets.

00:08:22
For example, if you're

00:08:23
doing cryptography,

00:08:24
you have a secret key

00:08:25
involved that consists

00:08:26
of zero and one bits.

00:08:28
And depending on if

00:08:31
the bit in the key is

00:08:32
currently a zero or a

00:08:33
one, your CPU has to

00:08:35
do different things.

00:08:37
And that involves

00:08:38
different resources and

00:08:41
different computation.

00:08:42
And we can see that

00:08:43
in certain patterns.

00:08:45
We can see that in when

00:08:47
memory is accessed or

00:08:48
memory is not accessed.

00:08:50
When certain parts

00:08:51
of the CPU are

00:08:52
active or not active,

00:08:54
then we can't use

00:08:56
them, for example.

00:08:58
And we see that.

00:08:58
And then we can

00:08:59
infer like, okay.

00:09:00
Now there's a zero bit.

00:09:01
Now there's a one

00:09:02
bit, a one bit, a zero

00:09:03
bit, just by observing

00:09:05
some other effects in

00:09:06
our own applications.

00:09:08
And from that, inferring

00:09:09
an entire key, for

00:09:10
example, breaking the

00:09:12
crypto, even though it's

00:09:13
mathematically secure,

00:09:15
the implementation

00:09:16
is correct.

00:09:17
There are no

00:09:17
software bugs.

00:09:19
But the side effects

00:09:20
are observable.

00:09:22
And from that, we

00:09:22
can infer the key.
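The bit-by-bit inference can be modeled in a few lines: a victim touches one of two memory lines depending on the current key bit, and the observer recovers the bit by checking which line ended up cached. The cache is modeled as a plain set, and all names are illustrative, not a real attack primitive.

```python
# Toy simulation of inferring secret key bits from cache state: the victim's
# memory access depends on the current key bit, and the observer checks
# which of two lines is "hot" (cached) afterwards.

def victim_step(bit, cache):
    # Key-dependent access: bit 0 touches line A, bit 1 touches line B.
    cache.add("line_A" if bit == 0 else "line_B")

def probe(cache):
    # A cached ("hot") line reveals which access the victim performed.
    return 0 if "line_A" in cache else 1

def recover_key(secret_bits):
    recovered = []
    for bit in secret_bits:
        cache = set()            # observer flushed both lines beforehand
        victim_step(bit, cache)  # victim runs one key-dependent step
        recovered.append(probe(cache))
    return recovered

print(recover_key([1, 0, 1, 1, 0, 0, 1, 0]))  # → [1, 0, 1, 1, 0, 0, 1, 0]
```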

00:09:24
Often not 100

00:09:25
percent correct.

00:09:27
But even if you get,

00:09:28
let's say we have

00:09:28
an AES key, 128 bits.

00:09:32
We can get 120 bits

00:09:33
correct, guessing

00:09:35
the remaining ones,

00:09:36
that's doable.

00:09:37
Right, right.

00:09:38
You basically just brute

00:09:40
force the remaining bits

00:09:41
by trying the potential.

00:09:43
Exactly.

00:09:43
And then maybe you

00:09:44
have to try 1000

00:09:45
different keys.

00:09:47
But at some point

00:09:48
we'll get it correct.
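The "recover most bits, brute-force the rest" step is cheap to sketch: with 8 of 128 bits unknown, only 256 candidates remain. The key, the missed positions, and the check function below are all made up for the example.

```python
# Sketch of brute-forcing the few key bits a side channel failed to leak.
from itertools import product

def brute_force_rest(leaked_bits, unknown_positions, is_correct):
    # Try every 0/1 combination for the unknown positions.
    for guess in product([0, 1], repeat=len(unknown_positions)):
        candidate = list(leaked_bits)
        for pos, bit in zip(unknown_positions, guess):
            candidate[pos] = bit
        if is_correct(candidate):
            return candidate
    return None

true_key = [(i * 3) % 2 for i in range(128)]    # stand-in 128-bit key
unknown = [3, 17, 42, 63, 80, 99, 110, 127]     # bits the side channel missed
leaked = [b if i not in unknown else None for i, b in enumerate(true_key)]

recovered = brute_force_rest(leaked, unknown, lambda c: c == true_key)
print(recovered == true_key)  # → True, after at most 256 guesses
```

In a real attack the check would be, say, a trial decryption rather than a comparison against the true key, but the search space argument is the same.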

00:09:50
Huh.

00:09:50
Okay.

00:09:51
That's slightly

00:09:52
different from what

00:09:52
I thought it is.

00:09:54
I know, well, I don't

00:09:56
even know if the side

00:09:57
channel attack would be

00:09:58
correct name for that.

00:09:59
But basically when you

00:10:00
try to bring CPUs and

00:10:05
stuff into a hiccup by

00:10:07
giving them the wrong

00:10:08
signal at the wrong

00:10:08
time, at very specific

00:10:10
timings, and then you

00:10:12
just jump over certain

00:10:14
instructions or stuff.

00:10:15
Yes.

00:10:17
What you mean

00:10:17
that also exists.

00:10:18
These are the

00:10:19
hardware based side

00:10:20
channel attacks.

00:10:21
Okay.

00:10:22
But it is a side

00:10:22
channel attack.

00:10:23
Okay.

00:10:23
Yeah.

00:10:23
I wasn't-

00:10:24
Now I wasn't a

00:10:25
hundred percent

00:10:25
sure if I'm actually

00:10:26
correct about that.

00:10:28
Yes, they are also

00:10:29
considered side

00:10:30
channel attacks.

00:10:31
Even though nowadays we

00:10:32
mostly call them fault

00:10:33
attacks because you

00:10:35
think of something and

00:10:36
then you induce a fault

00:10:39
in the CPU, it skips

00:10:40
an instruction, it does

00:10:41
a wrong calculation,

00:10:43
stuff like that.

00:10:44
Right, fault injection.

00:10:45
That was the term

00:10:46
I was looking for.

00:10:47
Right.

00:10:47
Right.

00:10:48
Cool.

00:10:49
Yeah.

00:10:49
I love the example

00:10:51
with the neighbor

00:10:52
because that's just

00:10:53
incredibly visual for

00:10:55
everyone to understand.

00:10:58
But your team, you

00:10:59
worked on something

00:11:00
very specific, which

00:11:01
was the CacheWarp.

00:11:02
And I think I was

00:11:03
like, released two

00:11:04
years ago, a year and

00:11:06
a half ago, something?

00:11:08
I think it was last

00:11:09
year, November.

00:11:10
Or maybe last

00:11:11
year in November.

00:11:12
Maybe just say a few

00:11:13
words about that.

00:11:14
I think I read

00:11:15
about that on Heise.

00:11:16
I was actually

00:11:17
not aware you guys

00:11:18
are behind that.

00:11:19
So it was really

00:11:20
interesting when

00:11:21
I got the chance

00:11:22
to talk to you.

00:11:25
Yes, so this is a

00:11:27
really nice attack

00:11:28
and it shows something

00:11:29
very interesting.

00:11:31
We have CPUs for a

00:11:32
long time and we're

00:11:34
adding features on

00:11:36
top, on top, on top.

00:11:38
And sometimes you're

00:11:39
forgetting what we

00:11:40
already added back

00:11:41
then, let's say

00:11:42
in the eighties.

00:11:43
And it's still, there's

00:11:44
some legacy stuff

00:11:46
nobody dares to touch.

00:11:48
And we are adding

00:11:50
new features,

00:11:50
forgetting about

00:11:51
the legacy features

00:11:52
and also forgetting

00:11:53
to think about how

00:11:54
they could interact.

00:11:55
And CacheWarp

00:11:57
is a really nice

00:11:58
example of that.

00:12:00
So this targets

00:12:00
the newest AMD

00:12:02
CPUs of the Trusted

00:12:03
Execution Environment

00:12:04
used in the cloud.

00:12:06
This is SEV.

00:12:08
And this SEV has the

00:12:10
security guarantees

00:12:11
saying like everything

00:12:13
you run in there is

00:12:14
secure, even if your

00:12:16
cloud provider is

00:12:18
malicious or doesn't

00:12:19
have to be malicious,

00:12:20
but could be hacked.

00:12:23
Even with the

00:12:23
permission of the

00:12:25
cloud providers, you

00:12:25
have no way of seeing

00:12:27
what is running inside

00:12:28
the virtual machine.

00:12:29
The way it works is

00:12:30
that it encrypts every

00:12:33
virtual machine with

00:12:34
a specific hardware,

00:12:35
well, with the AES

00:12:37
key, I think, right?

00:12:38
Yes, exactly.

00:12:39
Yeah.

00:12:39
So, it encrypts.

00:12:40
It also attests that

00:12:42
it's running on actual

00:12:44
real hardware and

00:12:45
it's not emulated in

00:12:46
some way and gives

00:12:48
you, in theory, pretty

00:12:49
good guarantees that

00:12:52
everything you run there

00:12:53
cannot be modified,

00:12:54
cannot be seen.

00:12:56
And this looked fine,

00:12:59
but then we started

00:13:00
looking at this

00:13:02
interaction, like,

00:13:02
okay, we built this

00:13:03
super really cool stuff.

00:13:05
We have that in

00:13:06
the CPUs now, and

00:13:06
it's pretty new.

00:13:08
And on the other side,

00:13:09
we have instructions

00:13:10
in the CPU that were

00:13:12
introduced in the

00:13:13
beginning of x86.

00:13:16
And nobody uses them

00:13:17
anymore because we

00:13:18
don't have use cases for

00:13:19
them anymore in modern

00:13:20
operating systems.

00:13:22
And then you even read

00:13:23
the manuals about these

00:13:24
instructions.

00:13:25
Like what do they do

00:13:27
with the instructions?

00:13:28
They still exist.

00:13:29
That's the nice

00:13:30
thing about x86 being

00:13:32
backward compatible

00:13:33
all the way back.

00:13:35
And while I'm wondering,

00:13:36
like, what happens

00:13:37
if you use these

00:13:38
instructions that

00:13:39
are useless nowadays?

00:13:40
And the manual was

00:13:41
like, don't use them.

00:13:43
I'm like- that's,

00:13:46
interesting.

00:13:47
So it doesn't say

00:13:48
anything about it would

00:13:50
prevent us from using

00:13:51
them, which is like, if

00:13:53
you use modern features

00:13:54
like multi core, do not

00:13:56
use them because they

00:13:58
don't work as expected.

00:14:01
I see where

00:14:01
this is going.

00:14:03
Let's see.

00:14:04
I mean, if somebody

00:14:04
tells me to not

00:14:05
do something,

00:14:08
my first thing is like,

00:14:11
now I'm even, now I'm

00:14:11
doing it even harder.

00:14:13
So let's see what

00:14:14
is happening.

00:14:15
And that's exactly

00:14:17
what we tried.

00:14:18
And my student was

00:14:21
the main driving force

00:14:22
behind that, Ray.

00:14:24
I told him like,

00:14:25
look, this could be

00:14:26
very interesting.

00:14:28
See what happens.

00:14:29
And at first, first try,

00:14:32
I said, okay, we just

00:14:33
run this instruction.

00:14:34
Everything crashed.

00:14:36
It's like, oh,

00:14:38
that is not a real

00:14:40
problem yet, but

00:14:42
definitely interesting.

00:14:44
I don't know if cloud

00:14:45
providers would agree

00:14:46
with that sentiment.

00:14:48
Yeah, well, I mean,

00:14:50
if you're the cloud

00:14:50
provider making sure

00:14:52
something is not working

00:14:53
anymore, that's easy.

00:14:54
You could also

00:14:54
take a hammer.

00:14:55
Okay, fair.

00:14:58
Well, it's like, that's

00:15:00
a starting point.

00:15:00
That's pretty

00:15:01
interesting.

00:15:02
And then we're starting

00:15:04
in investigating

00:15:05
that, making theories,

00:15:06
what could happen.

00:15:07
So talking technical

00:15:09
detail, what does

00:15:10
this instruction do?

00:15:13
You have the DRAM

00:15:14
where you store all

00:15:15
your memory, and then

00:15:16
you have the cache

00:15:16
inside the CPU that

00:15:18
stores recently used

00:15:19
copies of the data

00:15:21
you have in memory to

00:15:22
make things faster.

00:15:23
And also if you

00:15:24
modify data in your

00:15:26
applications, they first

00:15:27
are modified just in

00:15:28
the cache and, let's say

00:15:30
if you have time, they

00:15:31
are written back to the

00:15:32
real big main memory.

00:15:35
And what this

00:15:36
instruction does,

00:15:37
it clears the cache.

00:15:40
And it was like,

00:15:41
yeah, it's a legacy

00:15:42
use case, for some

00:15:45
reasons when booting

00:15:47
a server, for example,

00:15:48
that you set everything

00:15:49
in a known state.

00:15:52
But what if you're

00:15:53
really running

00:15:53
machines now,

00:15:54
they modify data.

00:15:56
They only do that

00:15:56
inside the cache first,

00:15:58
and then we get rid

00:15:59
of all the content.

00:16:01
Is the modification

00:16:02
also lost?

00:16:03
And the short

00:16:04
answer, yes.

00:16:06
So for any virtual

00:16:08
machine, if this virtual

00:16:09
machine modifies some

00:16:10
data somewhere, updating

00:16:12
something, we can run

00:16:14
this instruction and

00:16:15
then this modification

00:16:16
is reverted.

00:16:17
So we can rollback to an

00:16:18
old state of the data.
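The rollback effect can be illustrated with a toy write-back cache model: modifications land in the cache first, and an INVD-style invalidation drops the dirty lines without writing them back, so memory reverts to the old values. This is a simulation of the concept, not the actual x86 instruction.

```python
# Toy write-back cache illustrating the CacheWarp effect.

class WriteBackCache:
    def __init__(self, dram):
        self.dram = dram    # backing main memory: address -> value
        self.dirty = {}     # modified lines held only in the cache

    def write(self, addr, value):
        self.dirty[addr] = value          # modified in the cache first

    def read(self, addr):
        return self.dirty.get(addr, self.dram[addr])

    def writeback(self):
        # Normal path: eventually flush dirty lines to main memory.
        self.dram.update(self.dirty)
        self.dirty.clear()

    def invd(self):
        # INVD-like path: discard dirty lines, losing the modifications.
        self.dirty.clear()

dram = {"flag": "old"}
cache = WriteBackCache(dram)
cache.write("flag", "new")   # the VM updates a value
print(cache.read("flag"))    # → new
cache.invd()                 # attacker invalidates without writeback
print(cache.read("flag"))    # → old: the modification is rolled back
```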

00:16:20
And you can run that

00:16:21
from any thread, even

00:16:24
from a different core?

00:16:25
How does that work?

00:16:27
You can do that

00:16:29
for basically

00:16:30
everything, yes.

00:16:31
Wow.

00:16:32
So this instruction was

00:16:34
designed at a time when

00:16:35
there was only one core.

00:16:37
Right, yeah.

00:16:37
And it also says in

00:16:38
the manual, if you

00:16:39
have more than one

00:16:40
core, it's undefined

00:16:42
what will happen.

00:16:45
My favorite behavior.

00:16:47
Yes, don't do it if you

00:16:49
have more than one core.

00:16:50
Okay.

00:16:53
And, but, even if it's

00:16:55
limited to one core as

00:16:57
a cloud provider, you

00:16:58
can easily schedule the

00:17:00
VM you want to attack

00:17:01
to this core and then

00:17:02
do it on this core.

00:17:06
And this is what

00:17:07
CacheWarp essentially

00:17:09
can do is reverting

00:17:10
modifications.

00:17:11
So you modify something

00:17:13
and as an attacker, you

00:17:14
go back to the old data.

00:17:16
And this is

00:17:17
something that does

00:17:18
not sound powerful

00:17:19
at all at first.

00:17:21
Because like, yeah,

00:17:22
well, the data was

00:17:23
also there before.

00:17:25
No harm done.

00:17:26
And the interesting

00:17:27
thing is you do

00:17:28
that selectively, so

00:17:30
you're not reverting

00:17:31
everything, so you're

00:17:32
not like restoring

00:17:33
a snapshot of your

00:17:34
virtual machine, but

00:17:36
only of partial parts

00:17:38
of the memory that you

00:17:39
can directly target.

00:17:41
And then you get

00:17:43
really nice effects

00:17:43
based on also how

00:17:45
we write programs.

00:17:47
So if you're thinking

00:17:48
about programs,

00:17:51
for example, the

00:17:52
sudo binary,

00:17:53
which elevates your

00:17:54
privileges to root

00:17:55
for an operation,

00:17:58
how does it do that?

00:17:59
Well, it asks the

00:18:00
operating system,

00:18:01
am I already root?

00:18:03
If so, it continues

00:18:04
with root.

00:18:05
Otherwise it asks for a

00:18:06
password, for example.

00:18:08
How is that implemented?

00:18:11
The root user has

00:18:13
the user ID zero.

00:18:15
When we write

00:18:15
programs, stuff is

00:18:17
typically initialized

00:18:19
at zero first.

00:18:20
So the permission is

00:18:23
zero because that's

00:18:24
how we start with

00:18:25
variables in memory.

00:18:27
Then the sudo program

00:18:28
asks the operating

00:18:29
system, like, Who am I?

00:18:31
The operating system

00:18:32
gives back the number

00:18:33
zero for root, another

00:18:36
number for any other

00:18:37
user, updates this

00:18:39
value in memory, use

00:18:41
CacheWarp, revert

00:18:42
it to zero, and

00:18:44
then we are root,

00:18:45
because it was also

00:18:46
initialized like that.
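The sudo scenario can be sketched in a few lines, with the rollback modeled simply as restoring the zero-initialized value before the privilege check. The uids and names are illustrative.

```python
# Sketch of the sudo scenario: the privilege variable starts zero-initialized
# (root's user ID is 0), the OS writes back the real uid, and a
# CacheWarp-style rollback restores the stale zero before the check.

def sudo_check(real_uid, rollback=False):
    uid = 0                # zero-initialized, which happens to mean "root"
    uid = real_uid         # the OS answers: 0 for root, e.g. 1000 otherwise
    if rollback:
        uid = 0            # attacker reverts the write to its initial state
    return "root shell" if uid == 0 else "ask for password"

print(sudo_check(1000))                 # → ask for password
print(sudo_check(1000, rollback=True))  # → root shell
```

The same zero-initialization pattern is why the error-flag example later in the conversation works: reverting the flag erases any trace that an error was ever recorded.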

00:18:47
And this is not only

00:18:49
like a specific case

00:18:50
of sudo, this is many

00:18:52
cases how we write

00:18:53
programs, we also,

00:18:55
for error checking,

00:18:56
we started like,

00:18:57
we had no error.

00:18:59
And then we check

00:19:00
certain things and if

00:19:01
we have an error, then

00:19:02
we update that like

00:19:03
we had an error, but

00:19:04
we can revert that.

00:19:06
And even if there was

00:19:07
an error, there's no

00:19:09
trace of that anymore

00:19:10
and we continue.

00:19:12
Right, right.

00:19:13
The second you said it,

00:19:15
it asked the operating

00:19:16
system, what user am I?

00:19:18
I was like, Oh yeah.

00:19:19
Okay.

00:19:19
ID zero.

00:19:20
I see.

00:19:23
That is good because

00:19:24
I, to be honest, I, saw

00:19:25
the CacheWarp exploit.

00:19:28
I did not really see how

00:19:30
this was used and now

00:19:32
it makes total sense.

00:19:33
Yeah.

00:19:34
Because I was also

00:19:35
like, how does it-

00:19:37
my understanding was

00:19:39
that you can actually

00:19:40
reset it to a different

00:19:41
previous state and I

00:19:42
never understood how

00:19:43
that worked, but you

00:19:44
basically just clear it

00:19:45
out and it's all zero.

00:19:47
And if you want

00:19:48
something to be zero,

00:19:51
that's the way to go.

00:19:53
Yes.

00:19:53
Yes.

00:19:53
Right.

00:19:54
Right.

00:19:58
Now you got me.

00:20:01
Wow.

00:20:02
I,

00:20:04
that is just brilliant.

00:20:07
I have so many use

00:20:07
cases for that now.

00:20:09
Thanks.

00:20:10
Of course, a lot of

00:20:11
the brilliance is then

00:20:13
still needed for finding

00:20:14
targets, but like,

00:20:15
okay, I can reset that

00:20:17
back to this value.

00:20:18
Where exactly do I

00:20:19
do that in a program

00:20:20
that it gives me

00:20:21
exactly what I want?

00:20:23
But we showed

00:20:24
quite a few cases

00:20:25
where that works.

00:20:25
You only have to time

00:20:27
it correctly, right?

00:20:28
So you have to figure

00:20:29
out when is the correct

00:20:31
point in time when

00:20:33
sudo would actually

00:20:34
ask like, Hey, give

00:20:35
me this operation

00:20:36
or give me this ID.

00:20:37
Yes.

00:20:38
Yes.

00:20:39
So that means you

00:20:41
basically create,

00:20:43
well, not a remote code

00:20:45
execution per se, but

00:20:48
you could probably make

00:20:49
that happen as well.

00:20:51
But you gain,

00:20:55
well, privilege

00:20:56
escalation, basically.

00:20:57
You gain root

00:20:58
permission.

00:20:59
So the next step would

00:21:00
be to find something

00:21:01
else to inject it

00:21:02
into a memory and say,

00:21:03
please execute that.

00:21:04
And I guess with CacheWarp,

00:21:07
you could probably do

00:21:08
the same thing by just

00:21:10
making sure you're

00:21:11
redirecting to the right

00:21:12
memory location now.

00:21:15
Yes.

00:21:15
So, this is one thing

00:21:16
you can work on a very

00:21:18
low level and also the

00:21:19
CPU remembers certain

00:21:21
things, like when you

00:21:23
go to some different

00:21:24
part of the code, how

00:21:25
to return, you can reset

00:21:27
that and you return

00:21:27
to some other place.

00:21:30
Makes it quite nice.

00:21:32
But we also showed

00:21:32
like this full

00:21:33
chain end to end

00:21:34
exploit in two steps.

00:21:37
So you have some way

00:21:39
to log into a server.

00:21:40
SSH, typically.

00:21:42
There's a

00:21:43
password check.

00:21:44
And with CacheWarp,

00:21:47
we trick that password

00:21:48
check into believing

00:21:49
that it does not

00:21:50
matter what we enter.

00:21:51
Any password is correct.

00:21:53
Then we are logged

00:21:54
in as a normal user.

00:21:56
Then you use it

00:21:56
again on sudo.

00:21:58
And then we are logged

00:21:59
into a server, into

00:22:00
any virtual machine

00:22:01
with root privileges.

00:22:03
Then we can just

00:22:05
execute whatever we

00:22:06
want there as root.

00:22:07
So full control

00:22:08
of the VM.

00:22:10
This entire thing takes

00:22:12
just a few seconds.

00:22:13
Wow.

00:22:14
And we had over 90

00:22:16
percent reliability.

00:22:19
And you basically

00:22:21
started with getting

00:22:23
a virtual machine on

00:22:24
some shared resource

00:22:25
and that's just how

00:22:27
you get into it.

00:22:28
It's hard to be precise,

00:22:29
I guess, to target

00:22:30
a specific company,

00:22:31
but you never know

00:22:32
who's alongside you.

00:22:34
Exactly.

00:22:35
Exactly.

00:22:37
Right.

00:22:37
I mean, we also don't

00:22:38
want to do that.

00:22:38
This is where the

00:22:40
academic part also ends.

00:22:42
We kind of already

00:22:43
overstepped it a bit

00:22:44
with showing like step

00:22:46
by step how to get

00:22:47
from this problem, this

00:22:50
logic problem in the

00:22:51
CPU to full, to taking

00:22:54
over an entire VM.

00:22:56
We are not going into

00:22:57
details like how to

00:22:59
attack a specific

00:23:00
company even though

00:23:00
a lot of my students

00:23:02
also ask these things

00:23:03
like, I now want to

00:23:04
break into Microsoft.

00:23:05
What do I do?

00:23:07
But this is not what

00:23:09
we should do, and

00:23:09
also not what we do.

00:23:11
That makes sense.

00:23:12
It was more like

00:23:12
a rhetorical

00:23:13
question, obviously.

00:23:17
Wow.

00:23:18
That is quite something,

00:23:20
especially it comes

00:23:22
about two years after,

00:23:23
or two and a half

00:23:24
years after Spectre and

00:23:27
Meltdown, which were

00:23:28
the other big things.

00:23:30
And as you said,

00:23:31
CPUs become more

00:23:32
and more complex.

00:23:33
At that point in time,

00:23:34
it was the predictor,

00:23:38
making guesses

00:23:39
where to jump next.

00:23:40
And it was the

00:23:41
same thing.

00:23:42
You do those things

00:23:43
to speed up CPUs and

00:23:45
to make them better.

00:23:46
But now you have all

00:23:49
those features and

00:23:51
people just figure

00:23:52
out how to use them.
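The branch-predictor issue discussed here is the classic Spectre v1 bounds-check-bypass pattern. A minimal, non-exploiting sketch of the vulnerable code shape (the function and array names are illustrative, not from the episode):

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative Spectre-v1 style gadget. Architecturally this function
 * never reads out of bounds; the risk is that the branch predictor may
 * speculatively run the in-bounds body for an out-of-bounds x, leaving
 * a cache footprint that depends on whatever byte sits at array1[x]. */
uint8_t array1[16];
uint8_t array2[256 * 512];
size_t  array1_size = 16;

uint8_t victim(size_t x) {
    if (x < array1_size)                 /* predictor guesses "taken" */
        return array2[array1[x] * 512];  /* secret-dependent load */
    return 0;
}
```

The function is perfectly correct code; only speculative execution plus cache-timing measurement turns it into a leak, which is why such gadgets are so hard to find and fix.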

00:23:54
I think people are

00:23:55
probably more familiar

00:23:56
with Meltdown and

00:23:57
Spectre because it

00:23:57
was like, if you want

00:24:00
to say it was a big

00:24:03
F up, because also it

00:24:05
involved AMD, Intel, and

00:24:07
even ARM CPUs, as far

00:24:09
as I remember, right?

00:24:10
ARM was a little bit

00:24:11
late to the game, but

00:24:11
people figured it out

00:24:12
how to do it as well.

00:24:14
And the Meltdown

00:24:16
thing was like three

00:24:17
different CVEs or

00:24:19
four, even, different-

00:24:21
In the beginning, one

00:24:23
and two for Spectre.

00:24:25
Or two for, yeah.

00:24:26
It was a couple

00:24:27
of iterations.

00:24:29
From your perspective,

00:24:30
which one is worse?

00:24:32
It's a very difficult

00:24:33
question, but also

00:24:35
a really interesting

00:24:36
one to think about.

00:24:37
So when we-

00:24:40
So Meltdown and Spectre, we

00:24:41
published that in 2018,

00:24:43
beginning of 2018, we

00:24:45
discovered it in 2017.

00:24:47
Back then, we did not

00:24:49
fully understand the

00:24:50
consequences of that.

00:24:53
Now in hindsight,

00:24:54
I would say, it

00:24:56
really depends.

00:24:57
So Meltdown is something

00:25:00
that has a huge impact

00:25:02
or had a huge impact,

00:25:03
but luckily, like the

00:25:04
Y2K bug, nobody

00:25:06
saw the impact, because

00:25:09
that was disclosed in

00:25:10
June 2017 and made

00:25:13
public in January 2018.

00:25:16
And so all the vendors

00:25:18
had time to work on

00:25:20
fixes, workarounds.

00:25:22
And when it was

00:25:23
public, we had already

00:25:26
systems protected

00:25:27
against exploitation.

00:25:28
So the issues were not

00:25:29
fixed, but at least

00:25:31
exploitation was made

00:25:33
difficult to impossible,

00:25:34
depending on the system.

00:25:35
So we did not

00:25:36
see the impact.

00:25:37
If that hadn't happened,

00:25:40
that would have been

00:25:40
a huge impact because

00:25:42
the code for mounting that attack

00:25:43
is so extremely easy.

00:25:45
Back then, our group was

00:25:48
part of the discovery.

00:25:49
We printed t-shirts.

00:25:51
And we could fit

00:25:52
the entire exploit

00:25:52
code on a t-shirt.

00:25:54
I remember that t-shirt.

00:25:56
But that was at least

00:25:58
easy to mitigate.

00:26:00
Spectre on the other

00:26:00
side is really difficult

00:26:02
and we still have that.

00:26:03
And we still have

00:26:04
to deal with that.

00:26:05
We still don't have

00:26:06
real solutions.

00:26:07
But it's also way

00:26:08
harder to exploit

00:26:10
for an attacker.

00:26:11
That's similar to

00:26:12
stuff you have in

00:26:13
software security.

00:26:15
Let's say easier things

00:26:17
like buffer overflows.

00:26:18
We know them since

00:26:19
the eighties.

00:26:20
We know how to

00:26:21
exploit them.

00:26:22
It gets harder and

00:26:23
harder to exploit.

00:26:24
We still have the bugs.

00:26:26
We can't fundamentally

00:26:27
fix them.

00:26:28
We could, but we don't.

00:26:31
But also, we

00:26:32
live with that.

00:26:33
And similar with

00:26:33
Spectre, we don't

00:26:34
know how to fix that.

00:26:36
But as it's so difficult

00:26:37
to exploit, we kind

00:26:38
of live with that.

00:26:40
Meltdown would be easy.

00:26:41
So we had to do

00:26:42
something immediately

00:26:43
and luckily also found

00:26:45
ways for fixing that.

00:26:47
CacheWarp goes in

00:26:48
a similar direction

00:26:49
as Meltdown.

00:26:51
It's very easy

00:26:52
to exploit.

00:26:53
It's extremely powerful,

00:26:55
but luckily also AMD

00:26:57
was able to fix that

00:26:59
very quickly, let's

00:27:01
say within half a year.

00:27:02
It's not the greatest

00:27:05
fix, because it works

00:27:06
by removing other

00:27:07
functionality. But nowadays

00:27:10
we are used to our

00:27:11
CPUs losing

00:27:12
functions with

00:27:13
security updates.

00:27:16
At least we have

00:27:16
a workaround.

00:27:17
It's not

00:27:17
exploitable anymore.

00:27:19
I hope cloud providers

00:27:22
also deploy that.

00:27:23
We cannot check.

00:27:25
But at least there

00:27:27
is something where we

00:27:28
could ensure nobody

00:27:29
can exploit it anymore.

00:27:31
So I guess it's a

00:27:32
microcode update and

00:27:33
you basically just load

00:27:34
that through the Linux

00:27:37
kernel or whatever.

00:27:38
Yeah, that makes sense.

00:27:40
So you think Spectre is

00:27:42
the worst one because

00:27:44
it's hard to fix?

00:27:46
Because this is a

00:27:48
design issue, it's

00:27:50
inherent to the design

00:27:52
versus the others

00:27:53
are implementation

00:27:54
issues where somebody

00:27:56
made a mistake

00:27:56
implementing something.

00:27:58
Right.

00:27:58
Okay.

00:27:58
That makes sense.

00:28:00
Yeah, we're almost-

00:28:02
Well, we are

00:28:03
out of time.

00:28:05
But one last question.

00:28:07
The one that I've

00:28:08
always asked, like,

00:28:09
what do you think is

00:28:10
the next big thing?

00:28:11
Are you working on

00:28:11
something, CacheWarp 2?

00:28:16
Always, and even worse.

00:28:18
Of course I can't give

00:28:20
any details on that.

00:28:23
But yes, we just started

00:28:25
to scratch the surface

00:28:27
of all these problems.

00:28:29
And this is also not

00:28:31
surprising if you

00:28:32
think about that.

00:28:33
We had software bugs

00:28:34
for years, decades.

00:28:37
Nobody's surprised

00:28:38
if there's a bug

00:28:38
in software.

00:28:39
We need a patch.

00:28:41
Everybody is suddenly

00:28:42
surprised that we

00:28:42
have that in the CPUs.

00:28:44
But also hardware

00:28:45
is nowadays just

00:28:46
written in software.

00:28:48
We have hardware

00:28:49
description languages.

00:28:50
These are just

00:28:51
programming languages

00:28:52
for hardware.

00:28:54
We compile our CPU.

00:28:56
So a program takes

00:28:57
care of taking our

00:28:58
description, making

00:28:59
that into hardware.

00:29:01
Simplified.

00:29:03
Of course it's also

00:29:04
written by humans

00:29:05
like software.

00:29:06
Humans make mistakes.

00:29:08
That's something

00:29:09
we will always see.

00:29:11
And so it's not a

00:29:12
surprise that we see

00:29:13
a lot of problems

00:29:14
in CPUs as well.

00:29:15
We will find more and

00:29:17
more over the years.

00:29:20
So far, we were

00:29:22
relatively lucky that

00:29:23
we could always add some

00:29:26
quick fix mitigation

00:29:28
that disabled some

00:29:29
functionality maybe, but

00:29:30
prevented exploitation.

00:29:33
I hope it stays that

00:29:34
way, but I fear not.

00:29:37
And at some point

00:29:37
we will see these

00:29:38
big problems that

00:29:40
we cannot fix.

00:29:42
And this is really

00:29:43
bad because we cannot

00:29:45
simply do an update

00:29:46
like with software.

00:29:47
And then we have to,

00:29:49
well, think about

00:29:50
what to do, right?

00:29:51
And I guess the same thing

00:29:52
goes for the ever

00:29:54
more complex graphics

00:29:57
cards, especially

00:29:58
when you go into the

00:29:59
AI accelerators and

00:30:01
stuff like that, right?

00:30:02
Those things

00:30:04
grow and get more

00:30:06
complex day by day.

00:30:09
Yes.

00:30:09
Yes.

00:30:10
We are currently mostly

00:30:12
adding complexity

00:30:14
and not trying to

00:30:14
simplify things.

00:30:16
Right.

00:30:17
Complexity

00:30:18
introduces problems.

00:30:21
That is very true.

00:30:22
And every software

00:30:23
developer, I

00:30:24
guess, knows that.

00:30:25
I was kind of shocked

00:30:27
how far off I was

00:30:28
with the Meltdown

00:30:30
and Spectre timing.

00:30:32
Seems like COVID

00:30:33
completely messed with

00:30:34
my feeling for time.

00:30:36
Yeah.

00:30:36
It also feels like

00:30:37
yesterday for me.

00:30:39
I can't imagine that.

00:30:40
Yeah.

00:30:41
Thank you very much.

00:30:42
It was a pleasure

00:30:42
having you.

00:30:43
I hope I have

00:30:44
the chance to have

00:30:45
you back somewhere in

00:30:46
the future, after the

00:30:48
next big exploit you

00:30:49
guys were working on.

00:30:51
Because that is

00:30:52
just like incredibly

00:30:53
interesting.

00:30:54
And I think for our

00:30:56
audience, which is

00:30:58
often cloud users, it's

00:30:59
also very relevant.

00:31:02
Thanks for having me.

00:31:03
It was a pleasure

00:31:04
talking to you

00:31:04
about that.

00:31:05
And yes, I'd be

00:31:06
happy to be back.

00:31:07
All right.

00:31:08
Thank you very much for

00:31:10
the audience as well.

00:31:11
Next week, same

00:31:12
time, same place.

00:31:14
And I hope you're

00:31:15
listening in again.

00:31:16
And thank you very much

00:31:17
for being here as well.

00:31:21
The Cloud Commute

00:31:22
Podcast is sponsored

00:31:23
by Simplyblock.

00:31:24
Your own elastic

00:31:25
block storage engine

00:31:25
for the cloud.

00:31:26
Get higher IOPS and

00:31:27
low predictable latency

00:31:29
while bringing down your

00:31:30
total cost of ownership.

00:31:31
www.simplyblock.io