In this week's episode we have a very special guest: Michael Schwarz from CISPA, a security researcher specializing in CPU side-channel attacks. He explains how side-channel attacks work in general, and more specifically using the example of his team's most recent find: #CacheWarp. He was also involved in the discovery of #Meltdown and #Spectre.
In this episode of Cloud Commute, Chris and Michael discuss:
- Understanding side-channel attacks and their impact on cloud security
- The role of CISPA in cybersecurity research and startup support
- CacheWarp: A CPU vulnerability that enables privilege escalation
- The future of hardware vulnerabilities and increasing complexity in CPUs
Interested in learning more about the cloud infrastructure stack, like storage, security, and Kubernetes? Head to our website (www.simplyblock.io/cloud-commute-podcast) for more episodes, and follow us on LinkedIn (www.linkedin.com/company/simplyblock-io). You can also check out the detailed show notes on YouTube (www.youtube.com/watch?v=fLUSrcah2xE).
You can find Michael Schwarz on X @misc0110 and LinkedIn: /michael-schwarz-5aaab720.
About simplyblock:
Simplyblock is an intelligent database storage orchestrator for IO-intensive workloads in Kubernetes, including databases and analytics solutions. It uses smart NVMe caching to speed up read I/O latency and queries. A single system connects local NVMe disks, GP3 volumes, and S3 making it easier to handle storage capacity and performance. With the benefits of thin provisioning, storage tiering, and volume pooling, your database workloads get better performance at lower cost without changes to existing AWS infrastructure.
👉 Get started with simplyblock: https://www.simplyblock.io/buy-now
🏪 simplyblock AWS Marketplace: https://aws.amazon.com/marketplace/seller-profile?id=seller-fzdtuccq3edzm
00:00:00
But then we started
00:00:00
looking at this
00:00:01
interaction, like, okay,
00:00:02
we built this super,
00:00:03
really cool stuff.
00:00:05
We have that in
00:00:05
the CPUs now, and
00:00:06
it's pretty new.
00:00:07
And on the other side,
00:00:08
we have instructions
00:00:09
in the CPU that
00:00:10
were introduced
00:00:11
in the beginning
00:00:12
of x86 and nobody
00:00:13
uses them anymore.
00:00:14
CPUs become more
00:00:15
and more complex.
00:00:17
At that point in time,
00:00:17
it was the predictor
00:00:19
making guesses
00:00:20
where to jump next.
00:00:21
Look outside your
00:00:22
window and you see
00:00:23
if the lights are on.
00:00:24
And if the lights are
00:00:24
on, you have a certain
00:00:26
probability that the
00:00:27
neighbor is at home.
00:00:28
The lovely noisy
00:00:29
neighbor problem
00:00:30
that everyone knows
00:00:31
from the cloud.
00:00:32
I love the example
00:00:33
with the neighbor.
00:00:33
That's just incredibly
00:00:35
visual for everyone
00:00:37
to understand.
00:00:37
So if this noise that
00:00:39
you're seeing depends
00:00:40
on some secrets, then
00:00:42
it gets interesting.
00:00:43
And this is what we do.
00:00:46
You're listening to
00:00:47
Simplyblock's Cloud
00:00:47
Commute Podcast,
00:00:49
your weekly 20
00:00:49
minute podcast about
00:00:51
cloud technologies,
00:00:52
Kubernetes, security,
00:00:53
sustainability,
00:00:54
and more.
00:00:57
Hello, everyone.
00:00:58
Welcome back to this
00:00:59
week's episode of
00:01:00
Simplyblock's Cloud
00:01:01
Commute Podcast.
00:01:01
And as you know,
00:01:03
I have another
00:01:03
incredible guest.
00:01:04
I know I say this every
00:01:05
single time and it's
00:01:06
every single time true.
00:01:08
So, with me this week,
00:01:13
is Michael Schwarz,
00:01:16
fellow German.
00:01:18
Actually a security
00:01:19
researcher, with CISPA,
00:01:21
but he's gonna say a
00:01:23
few words about that.
00:01:24
So, welcome.
00:01:26
Welcome, Michael.
00:01:28
And maybe just introduce
00:01:29
yourself very quick.
00:01:31
Thanks for this
00:01:32
nice intro here.
00:01:34
Yes.
00:01:35
So as you already
00:01:36
said, I'm a
00:01:37
security researcher.
00:01:38
I'm working here
00:01:39
at CISPA, which
00:01:40
is a big research
00:01:42
center in Germany.
00:01:44
If you look at the
00:01:45
academic rankings,
00:01:47
one of the worldwide
00:01:48
leading institutes
00:01:49
for cybersecurity.
00:01:51
In my position
00:01:52
here, I'm leading
00:01:53
a research group.
00:01:54
I'm a faculty here.
00:01:55
So currently I have
00:01:57
six PhD students
00:01:59
that I advise and
00:02:01
additionally five
00:02:02
student helpers that
00:02:04
support me in the tasks.
00:02:05
We are a bit bigger,
00:02:09
not the biggest group,
00:02:10
but still a considerably
00:02:12
large group.
00:02:14
And we are working
00:02:16
on very specific
00:02:17
topics, I have to say.
00:02:18
It's also one of
00:02:19
the things I'm
00:02:19
talking about today.
00:02:21
We are working on kind
00:02:23
of all things related
00:02:25
to side channel attacks.
00:02:26
It's also a term that
00:02:28
we probably have to
00:02:29
introduce in the podcast
00:02:30
because it's, I guess,
00:02:32
not common knowledge.
00:02:34
Right.
00:02:34
Right.
00:02:35
We'll get back
00:02:35
to that in a second.
00:02:36
Maybe you can expand
00:02:37
a little bit on CISPA.
00:02:39
CISPA is interesting
00:02:40
because Simplyblock
00:02:43
itself, well-
00:02:44
We're not part of CISPA,
00:02:46
but CISPA supports us
00:02:47
as well, due to the
00:02:50
encryption efforts we do
00:02:51
with the data storage.
00:02:52
But maybe say a
00:02:53
few more words
00:02:54
about CISPA itself.
00:02:56
Yes.
00:02:58
So CISPA is this
00:02:59
research center
00:03:01
here that
00:03:02
has grown a lot.
00:03:03
So it's still
00:03:04
relatively new.
00:03:05
It was founded
00:03:06
six years ago.
00:03:08
And now we already
00:03:09
have 600, more than 600
00:03:11
people working here.
00:03:14
Among them, a lot
00:03:15
of scientists.
00:03:16
So we have now roughly
00:03:18
40 group leaders
00:03:20
here that have their
00:03:21
own research groups,
00:03:22
covering all topics
00:03:24
that are information
00:03:25
security related.
00:03:27
But also AI,
00:03:28
machine learning.
00:03:30
So we are also in this area.
00:03:33
We are mainly
00:03:35
scientists.
00:03:36
We are mainly working
00:03:38
on academic problems.
00:03:40
We try to
00:03:40
solve problems.
00:03:42
We write them down.
00:03:42
We publish papers
00:03:44
at the top academic
00:03:45
conferences worldwide.
00:03:47
And with that, we are
00:03:48
also leading worldwide.
00:03:50
So there's no other
00:03:51
university or research
00:03:53
center internationally
00:03:54
that has more of
00:03:55
these top publications
00:03:57
worldwide on this,
00:03:58
at these conferences.
00:04:01
So we really try to
00:04:02
be the best in all
00:04:03
the things we do.
00:04:05
Sounds like
00:04:06
marketing now.
00:04:07
But
00:04:09
this is really
00:04:10
what we want to do.
00:04:11
We want to be the best.
00:04:13
And of course,
00:04:13
also train the
00:04:14
next generation.
00:04:16
So that means
00:04:17
getting new students.
00:04:19
And also as you've seen
00:04:21
yourself, we want to
00:04:22
support companies here.
00:04:24
We're helping
00:04:25
startups with a
00:04:27
security background and
00:04:30
well, we want to make
00:04:31
sure that more people
00:04:33
work in this area.
00:04:35
But ideally also in
00:04:37
Germany that we are
00:04:38
creating more jobs
00:04:40
here about these topics
00:04:43
because we also believe
00:04:44
that these are the
00:04:45
topics that will stay
00:04:46
relevant in the future.
00:04:48
I agree.
00:04:50
You already hinted at
00:04:51
side channel attacks.
00:04:53
I'm a bit of a geek.
00:04:54
So I love to look
00:04:55
into all of that.
00:04:56
I'm probably not,
00:04:59
well, I don't want
00:05:00
to say technically
00:05:01
skilled enough, but
00:05:01
I'm probably not deep
00:05:02
enough into side channel
00:05:04
attacks to specifically
00:05:05
explain how it works.
00:05:07
But I think you're
00:05:08
doing a much better job.
00:05:10
Maybe just give us an
00:05:11
explanation, especially
00:05:12
the audience, what is
00:05:13
specifically a side
00:05:14
channel attack and how
00:05:15
kind of does that work?
00:05:17
Yes.
00:05:18
Maybe let's start
00:05:19
with an intuition
00:05:20
from the real world.
00:05:22
So sometimes when you
00:05:24
live somewhere, you have
00:05:25
a neighbor and you don't
00:05:26
know if this neighbor is
00:05:28
at home or not at home
00:05:29
or on vacation or not.
00:05:31
And you're also
00:05:31
not talking to
00:05:32
this neighbor.
00:05:32
You might not
00:05:35
even know the neighbor,
00:05:37
but you can still
00:05:38
learn something by
00:05:39
observing side effects
00:05:41
of the behavior
00:05:42
of your neighbor.
00:05:43
For example, look
00:05:44
outside your window
00:05:46
and you see if
00:05:46
the lights are on.
00:05:48
And if the lights are
00:05:49
on, you have a certain
00:05:51
probability that the
00:05:52
neighbor is at home.
00:05:54
I mean, you're not 100
00:05:55
percent sure, right?
00:05:56
Sometimes you forget
00:05:57
to turn off the lights.
00:05:59
Then your
00:06:00
guess is wrong.
00:06:01
Sometimes you are at
00:06:02
home and don't have the
00:06:03
lights turned on because
00:06:05
maybe you're watching
00:06:05
TV or something.
00:06:07
But still you get a
00:06:08
good chance, if you
00:06:09
see lights, you can
00:06:10
assume that probably
00:06:12
this person is at home.
00:06:13
If the lights are
00:06:14
off, probably not.
00:06:15
If lights haven't
00:06:17
been on for a week,
00:06:18
probably the neighbor
00:06:19
is on vacation or died.
00:06:21
Hopefully not.
00:06:24
So you learn
00:06:25
certain things.
00:06:26
But just observing
00:06:27
things, not directly
00:06:28
about the neighbor, but
00:06:30
what is influenced by
00:06:31
the neighbor's behavior.
00:06:34
And in the real world,
00:06:36
we have many such
00:06:37
scenarios where we
00:06:38
just see something and
00:06:40
then try to infer what
00:06:41
is really happening.
00:06:43
In computer science, we
00:06:44
try to do the same on
00:06:46
the software level and
00:06:47
on the hardware level.
00:06:49
So here, what we
00:06:50
do in our research,
00:06:51
we're not observing
00:06:52
neighbors directly,
00:06:54
but it's also not so
00:06:55
far away, if you're
00:06:56
talking about the cloud.
00:06:58
We also have neighbors
00:06:59
on the cloud.
00:07:00
We want to see what is
00:07:02
this neighbor on the
00:07:04
cloud, this other user
00:07:06
running on the same
00:07:07
server actually doing.
00:07:10
We can't directly
00:07:11
talk to them.
00:07:12
We can't see what
00:07:13
they're doing.
00:07:14
But we see certain
00:07:15
side effects of that.
00:07:17
So also intuitively
00:07:18
think about you run an
00:07:20
application that uses
00:07:21
a lot of resources,
00:07:23
that uses all your
00:07:24
CPU, all your memory.
00:07:26
Then the other
00:07:27
application maybe from a
00:07:28
different customer, sees
00:07:31
some bottlenecks there,
00:07:32
seeing a slowdown in
00:07:33
their own application.
00:07:35
And from that, you can
00:07:36
already infer not much.
00:07:38
So this is like a really
00:07:40
simple one, contention
00:07:42
based side channel
00:07:43
is what it's called.
00:07:44
Someone uses resources,
00:07:46
resources are not
00:07:47
endless, so you cannot
00:07:49
use them, and then
00:07:50
you already see that
00:07:51
something is happening.
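The contention idea described here can be sketched as a small simulation. This is a toy model rather than a real measurement: the capacity figure, the function names, and the victim's load pattern are all made up for illustration. It only shows that an observer who watches their own throughput on a shared resource can tell when someone else is busy.

```python
# Toy model of a contention-based side channel (all numbers are hypothetical).
CAPACITY = 100  # units of a shared resource available per time slice


def attacker_throughput(victim_demand, attacker_demand=80):
    """Return how much of the shared resource the attacker gets in one slice."""
    available = max(CAPACITY - victim_demand, 0)
    return min(attacker_demand, available)


def infer_victim_activity(samples, baseline=80):
    """Guess, per time slice, whether a neighbor was active (True) or idle."""
    return [throughput < baseline for throughput in samples]


# The victim's load is never observed directly; only our own throughput is.
victim_load = [0, 50, 0, 90]
observed = [attacker_throughput(load) for load in victim_load]
guesses = infer_victim_activity(observed)
```

In this sketch the observer never reads `victim_load`; the guesses come purely from the slowdown in `observed`.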
00:07:52
The lovely noisy
00:07:53
neighbor problem
00:07:54
that everyone knows
00:07:55
from the cloud.
00:07:56
Yes.
00:07:57
And that is often seen as a
00:07:58
performance problem,
00:07:59
but it's actually
00:08:00
a security issue.
00:08:01
If you can then start,
00:08:03
try to infer not
00:08:04
only that there is
00:08:05
a neighbor, but what
00:08:07
the neighbor is doing.
00:08:08
So if this noise that
00:08:10
you're seeing depends
00:08:12
on some secrets, then
00:08:14
it gets interesting.
00:08:15
And this is what we do.
00:08:16
We try to find such
00:08:17
noise patterns that
00:08:19
are unique to secrets.
00:08:22
For example, if you're
00:08:23
doing cryptography,
00:08:24
you have a secret key
00:08:25
involved that consists
00:08:26
of zero and one bits.
00:08:28
And depending on if
00:08:31
the bit in the key is
00:08:32
currently a zero or a
00:08:33
one, your CPU has to
00:08:35
do different things.
00:08:37
And that involves
00:08:38
different resources and
00:08:41
different computation.
00:08:42
And we can see that
00:08:43
in certain patterns.
00:08:45
We can see that in when
00:08:47
memory is accessed or
00:08:48
memory is not accessed.
00:08:50
When certain parts
00:08:51
of the CPU are
00:08:52
active or not active,
00:08:54
then we can't use
00:08:56
them, for example.
00:08:58
And we see that.
00:08:58
And then we can
00:08:59
infer like, okay.
00:09:00
Now there's a zero bit.
00:09:01
Now there's a one
00:09:02
bit, a one bit, a zero
00:09:03
bit, just by observing
00:09:05
some other effects in
00:09:06
our own applications.
00:09:08
And from that, inferring
00:09:09
an entire key, for
00:09:10
example, breaking the
00:09:12
crypto, even though it's
00:09:13
mathematically secure,
00:09:15
the implementation
00:09:16
is correct.
00:09:17
There are no
00:09:17
software bugs.
00:09:19
But the side effects
00:09:20
are observable.
00:09:22
And from that, we
00:09:22
can infer the key.
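One classic illustration of "the CPU does different things for a zero bit and a one bit" is square-and-multiply exponentiation, where a multiplication only happens for one bits. The sketch below is an assumption chosen for illustration, not the specific target discussed in the episode: it records the sequence of operations as an idealized, noise-free trace and rebuilds the secret exponent from that trace alone. A real attack would have to recover such a trace from timing or cache observations, which is the hard part.

```python
def modexp_with_trace(base, exponent, modulus):
    """Left-to-right square-and-multiply that also records which operations ran."""
    result = 1
    trace = []
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus  # always executed
        trace.append("S")
        if bit == "1":
            result = (result * base) % modulus  # only executed for one bits
            trace.append("M")
    return result, trace


def recover_exponent(trace):
    """Rebuild the exponent from the operation trace alone."""
    bits = []
    i = 0
    while i < len(trace):
        # Every key bit starts with a square; a following multiply means the bit was 1.
        if i + 1 < len(trace) and trace[i + 1] == "M":
            bits.append("1")
            i += 2
        else:
            bits.append("0")
            i += 1
    return int("".join(bits), 2)


secret = 0b1011001  # hypothetical secret exponent
value, observed_trace = modexp_with_trace(7, secret, 1009)
recovered = recover_exponent(observed_trace)
```

The arithmetic is correct and there is no software bug in `modexp_with_trace`; the leak comes only from which operations are performed.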
00:09:24
Often not 100
00:09:25
percent correct.
00:09:27
But even if you get,
00:09:28
let's say we have
00:09:28
an AES key, 128 bits.
00:09:32
We can get 120 bits
00:09:33
correct, guessing
00:09:35
the remaining ones,
00:09:36
that's doable.
00:09:37
Right, right.
00:09:38
You basically just brute
00:09:40
force the remaining bits
00:09:41
by trying the potential.
00:09:43
Exactly.
00:09:43
And then maybe you
00:09:44
have to try 1000
00:09:45
different keys.
00:09:47
But at some point
00:09:48
we'll get it correct.
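The arithmetic behind this can be sketched as follows. The sketch assumes the attacker knows which 8 bit positions are uncertain and has some way to test a candidate key; both are assumptions, and the `fingerprint` check below is a hypothetical stand-in for, say, trial decryption of a known message. With 8 unknown bits there are at most 2^8 = 256 candidates; 10 unknown bits would give 1024, close to the "1000 different keys" mentioned above.

```python
import hashlib
from itertools import product


def brute_force_missing_bits(partial_key, unknown_positions, is_correct):
    """Try every assignment of the unknown bit positions until the check passes."""
    for guess in product((0, 1), repeat=len(unknown_positions)):
        candidate = partial_key
        for position, bit in zip(unknown_positions, guess):
            # Clear the bit at this position, then set it to the guessed value.
            candidate = (candidate & ~(1 << position)) | (bit << position)
        if is_correct(candidate):
            return candidate
    return None


# Hypothetical 128-bit key, used only to build the demonstration.
true_key = int.from_bytes(hashlib.sha256(b"demo key").digest()[:16], "big")
fingerprint = hashlib.sha256(true_key.to_bytes(16, "big")).digest()

# Pretend the side channel recovered 120 bits and left these 8 positions uncertain.
unknown_positions = [3, 17, 42, 64, 77, 90, 101, 127]
partial_key = true_key
for position in unknown_positions:
    partial_key &= ~(1 << position)


def is_correct(candidate):
    """Stand-in oracle: compare against a known fingerprint of the real key."""
    return hashlib.sha256(candidate.to_bytes(16, "big")).digest() == fingerprint


recovered_key = brute_force_missing_bits(partial_key, unknown_positions, is_correct)
```

If the positions of the wrong bits are not known, the search space is much larger than this sketch suggests.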
00:09:50
Huh.
00:09:50
Okay.
00:09:51
That's slightly
00:09:52
different from what
00:09:52
I thought it is.
00:09:54
I know, well, I don't
00:09:56
even know if the side
00:09:57
channel attack would be
00:09:58
the correct name for that.
00:09:59
But basically when you
00:10:00
try to bring CPUs and
00:10:05
stuff into a hiccup by
00:10:07
giving them the wrong
00:10:08
signal at the wrong
00:10:08
time, at very specific
00:10:10
timings, and then you
00:10:12
just jump over certain
00:10:14
instructions or stuff.
00:10:15
Yes.
00:10:17
What you mean,
00:10:17
that also exists.
00:10:18
These are the
00:10:19
hardware based side
00:10:20
channel attacks.
00:10:21
Okay.
00:10:22
But it is a side
00:10:22
channel attack.
00:10:23
Okay.
00:10:23
Yeah.
00:10:23
I wasn't-
00:10:24
Now I wasn't a
00:10:25
hundred percent
00:10:25
sure if I'm actually
00:10:26
correct about that.
00:10:28
Yes, they are also
00:10:29
considered side
00:10:30
channel attacks.
00:10:31
Even though nowadays we
00:10:32
mostly call them fault
00:10:33
attacks because you
00:10:35
think of something and
00:10:36
then you induce a fault
00:10:39
in the CPU, it skips
00:10:40
an instruction, it does
00:10:41
a wrong calculation,
00:10:43
stuff like that.
00:10:44
Right, fault injection.
00:10:45
That was the term
00:10:46
I was looking for.
00:10:47
Right.
00:10:47
Right.
00:10:48
Cool.
00:10:49
Yeah.
00:10:49
I love the example
00:10:51
with the neighbor
00:10:52
because that's just
00:10:53
incredibly visual for
00:10:55
everyone to understand.
00:10:58
But your team, you
00:10:59
worked on something
00:11:00
very specific, which
00:11:01
was the CacheWarp.
00:11:02
And I think it was
00:11:03
like, released two
00:11:04
years ago, a year and
00:11:06
a half ago, something?
00:11:08
I think it was last
00:11:09
year, November.
00:11:10
Or maybe last
00:11:11
year in November.
00:11:12
Maybe just say a few
00:11:13
words about that.
00:11:14
I think I read
00:11:15
about that on Heise.
00:11:16
I was actually
00:11:17
not aware you guys
00:11:18
are behind that.
00:11:19
So it was really
00:11:20
interesting when
00:11:21
I got the chance
00:11:22
to talk to you.
00:11:25
Yes, so this is a
00:11:27
really nice attack
00:11:28
and it shows something
00:11:29
very interesting.
00:11:31
We have CPUs for a
00:11:32
long time and we're
00:11:34
adding features on
00:11:36
top, on top, on top.
00:11:38
And sometimes you're
00:11:39
forgetting what we
00:11:40
already added back
00:11:41
then, let's say
00:11:42
in the eighties.
00:11:43
And it's still, there's
00:11:44
some legacy stuff
00:11:46
nobody dares to touch.
00:11:48
And we are adding
00:11:50
new features,
00:11:50
forgetting about
00:11:51
the legacy features
00:11:52
and also forgetting
00:11:53
to think about how
00:11:54
they could interact.
00:11:55
And CacheWarp
00:11:57
is a really nice
00:11:58
example of that.
00:12:00
So this targets
00:12:00
the newest AMD
00:12:02
CPUs of the Trusted
00:12:03
Execution Environment
00:12:04
used in the cloud.
00:12:06
This SEV.
00:12:08
And this SEV has the
00:12:10
security guarantees
00:12:11
saying like everything
00:12:13
you run in there is
00:12:14
secure, even if your
00:12:16
cloud provider is
00:12:18
malicious or doesn't
00:12:19
have to be malicious,
00:12:20
but could be hacked.
00:12:23
Even with the
00:12:23
permission of the
00:12:25
cloud providers, you
00:12:25
have no way of seeing
00:12:27
what is running inside
00:12:28
the virtual machine.
00:12:29
The way it works is
00:12:30
that it encrypts every
00:12:33
virtual machine with
00:12:34
a specific hardware,
00:12:35
well, with the AES
00:12:37
key, I think, right?
00:12:38
Yes, exactly.
00:12:39
Yeah.
00:12:39
So, it encrypts.
00:12:40
It also attests that
00:12:42
it's running on actual
00:12:44
real hardware and
00:12:45
it's not emulated in
00:12:46
some way and gives
00:12:48
you, in theory, pretty
00:12:49
good guarantees that
00:12:52
everything you run there
00:12:53
cannot be modified,
00:12:54
cannot be seen.
00:12:56
And this looked fine,
00:12:59
but then we started
00:13:00
looking at this
00:13:02
interaction, like,
00:13:02
okay, we built this
00:13:03
super really cool stuff.
00:13:05
We have that in
00:13:06
the CPUs now, and
00:13:06
it's pretty new.
00:13:08
And on the other side,
00:13:09
we have instructions
00:13:10
in the CPU that were
00:13:12
introduced in the
00:13:13
beginning of x86.
00:13:16
And nobody uses them
00:13:17
anymore because we
00:13:18
don't have use cases for
00:13:19
them anymore in modern
00:13:20
operating systems.
00:13:22
And then you even read
00:13:23
the manuals about these
00:13:24
instructions, seeing,
00:13:25
like, what do they do
00:13:27
with the instructions?
00:13:28
They still exist.
00:13:29
That's the nice
00:13:30
thing about x86 being
00:13:32
backward compatible
00:13:33
all the way back.
00:13:35
And, well, I'm wondering,
00:13:36
like, what happens
00:13:37
if you use these
00:13:38
instructions that
00:13:39
are useless nowadays?
00:13:40
And the manual was
00:13:41
like, don't use them.
00:13:43
I'm like, that's
00:13:46
interesting.
00:13:47
So it doesn't say
00:13:48
anything that would
00:13:50
prevent us from using
00:13:51
them, which is like, if
00:13:53
you use modern features
00:13:54
like multi core, do not
00:13:56
use them because they
00:13:58
don't work as expected.
00:14:01
I see where
00:14:01
this is going.
00:14:03
Let's see.
00:14:04
I mean, if somebody
00:14:04
tells me to not
00:14:05
do something,
00:14:08
my first thing is like,
00:14:11
now I'm even, now I'm
00:14:11
doing it even harder.
00:14:13
So let's see what
00:14:14
is happening.
00:14:15
And that's exactly
00:14:17
what we tried.
00:14:18
And my student was
00:14:21
the main driving force
00:14:22
behind that, Ray.
00:14:24
I told him like,
00:14:25
look, this could be
00:14:26
very interesting.
00:14:28
See what happens.
00:14:29
And at the first try,
00:14:32
I said, okay, we just
00:14:33
run this instruction.
00:14:34
Everything crashed.
00:14:36
It's like, oh,
00:14:38
that is not a real
00:14:40
problem yet, but
00:14:42
definitely interesting.
00:14:44
I don't know if cloud
00:14:45
providers would agree
00:14:46
with that sentiment.
00:14:48
Yeah, well, I mean,
00:14:50
if you're the cloud
00:14:50
provider making sure
00:14:52
something is not working
00:14:53
anymore, that's easy.
00:14:54
You could also
00:14:54
take a hammer.
00:14:55
Okay, fair.
00:14:58
Well, it's like, that's
00:15:00
a starting point.
00:15:00
That's pretty
00:15:01
interesting.
00:15:02
And then we started
00:15:04
investigating
00:15:05
that, making theories,
00:15:06
what could happen.
00:15:07
So talking technical
00:15:09
detail, what does
00:15:10
this instruction do?
00:15:13
You have the DRAM
00:15:14
where you store all
00:15:15
your memory, and then
00:15:16
you have the cache
00:15:16
inside the CPU that
00:15:18
stores recently used
00:15:19
copies of the data
00:15:21
you have in memory to
00:15:22
make things faster.
00:15:23
And also if you
00:15:24
modify data in your
00:15:26
applications, they first
00:15:27
are modified just in
00:15:28
the cache and, let's say
00:15:30
if you have time, they
00:15:31
are written back to the
00:15:32
real big main memory.
00:15:35
And what this
00:15:36
instruction does,
00:15:37
it clears the cache.
00:15:40
And it was like,
00:15:41
yeah, it's a legit
00:15:42
use case, for some
00:15:45
reasons when booting
00:15:47
a server, for example,
00:15:48
that you set everything
00:15:49
in a known state.
00:15:52
But what if you're
00:15:53
really running
00:15:53
machines now,
00:15:54
they modify data.
00:15:56
They only do that
00:15:56
inside the cache first,
00:15:58
and then we get rid
00:15:59
of all the content.
00:16:01
Is the modification
00:16:02
also lost?
00:16:03
And the short
00:16:04
answer, yes.
00:16:06
So for any virtual
00:16:08
machine, if this virtual
00:16:09
machine modifies some
00:16:10
data somewhere, updating
00:16:12
something, we can run
00:16:14
this instruction and
00:16:15
then this modification
00:16:16
is reverted.
00:16:17
So we can roll back to an
00:16:18
old state of the data.
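The behavior described here can be pictured with a toy write-back cache. This is a deliberately simplified model with made-up class and method names, not how a real CPU cache or the actual instruction is implemented: writes land in the cache first, and if the cache is dropped before they are written back, main memory still holds the old value.

```python
class WriteBackCache:
    """Toy write-back cache in front of a dict that stands in for DRAM."""

    def __init__(self, memory):
        self.memory = memory  # backing store: address -> value
        self.dirty = {}       # modified values that are not yet in memory

    def read(self, address):
        # Prefer the cached (possibly modified) copy over main memory.
        return self.dirty.get(address, self.memory[address])

    def write(self, address, value):
        # The modification only happens in the cache at first.
        self.dirty[address] = value

    def write_back(self):
        # Normal path: modified values eventually reach main memory.
        self.memory.update(self.dirty)
        self.dirty.clear()

    def invalidate_without_write_back(self):
        # Stand-in for the legacy instruction: drop cache contents, losing the writes.
        self.dirty.clear()


dram = {0x10: 0}
cache = WriteBackCache(dram)
cache.write(0x10, 1000)
value_before = cache.read(0x10)   # the program sees its own update
cache.invalidate_without_write_back()
value_after = cache.read(0x10)    # the update is gone; the old value is back
```

Had `write_back()` run before the invalidation, the new value would have survived, which is why timing matters for the attack.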
00:16:20
And you can run that
00:16:21
from any thread, even
00:16:24
from a different core?
00:16:25
How does that work?
00:16:27
You can do that
00:16:29
for basically
00:16:30
everything, yes.
00:16:31
Wow.
00:16:32
So this instruction was
00:16:34
designed at a time when
00:16:35
there was only one core.
00:16:37
Right, yeah.
00:16:37
And it also says in
00:16:38
the manual, if you
00:16:39
have more than one
00:16:40
core, it's undefined
00:16:42
what will happen.
00:16:45
My favorite behavior.
00:16:47
Yes, don't do it if you
00:16:49
have more than one core.
00:16:50
Okay.
00:16:53
And, but, even if it's
00:16:55
limited to one core as
00:16:57
a cloud provider, you
00:16:58
can easily schedule the
00:17:00
VM you want to attack
00:17:01
to this core and then
00:17:02
do it on this core.
00:17:06
And this is what
00:17:07
CacheWarp essentially
00:17:09
can do is reverting
00:17:10
modifications.
00:17:11
So you modify something
00:17:13
and as an attacker, you
00:17:14
go back to the old data.
00:17:16
And this is
00:17:17
something that does
00:17:18
not sound powerful
00:17:19
at all at first.
00:17:21
Because like, yeah,
00:17:22
well, the data was
00:17:23
also there before.
00:17:25
No harm done.
00:17:26
And the interesting
00:17:27
thing is you do
00:17:28
that selectively, so
00:17:30
you're not reverting
00:17:31
everything, so you're
00:17:32
not like restoring
00:17:33
a snapshot of your
00:17:34
virtual machine, but
00:17:36
only of partial parts
00:17:38
of the memory that you
00:17:39
can directly target.
00:17:41
And then you get
00:17:43
really nice effects
00:17:43
based on also how
00:17:45
we write programs.
00:17:47
So if you're thinking
00:17:48
about programs,
00:17:51
for example, the
00:17:52
sudo binary,
00:17:53
which elevates your
00:17:54
privileges to root
00:17:55
for an operation,
00:17:58
how does it do that?
00:17:59
Well, it asks the
00:18:00
operating system,
00:18:01
am I already root?
00:18:03
If so, it continues
00:18:04
with root.
00:18:05
Otherwise it asks for a
00:18:06
password, for example.
00:18:08
How is that implemented?
00:18:11
The root user has
00:18:13
this ID of zero.
00:18:15
When we write
00:18:15
programs, stuff is
00:18:17
typically initialized
00:18:19
at zero first.
00:18:20
So the permission is
00:18:23
zero because that's
00:18:24
how we start with
00:18:25
variables in memory.
00:18:27
Then the sudo program
00:18:28
asks the operating
00:18:29
system, like, Who am I?
00:18:31
The operating system
00:18:32
gives back the number
00:18:33
zero for root, another
00:18:36
number for any other
00:18:37
user, updates this
00:18:39
value in memory. We use
00:18:41
CacheWarp, revert
00:18:42
it to zero, and
00:18:44
then we are root,
00:18:45
because it was also
00:18:46
initialized like that.
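The logic of this step can be sketched in a few lines. This is a simplified illustration and not the real sudo source code: the function name and the dictionary standing in for memory are hypothetical, and the "revert" is modeled by simply restoring the previous value rather than by any cache manipulation.

```python
ROOT_UID = 0


def run_privilege_check(real_uid, attacker_reverts_store):
    """Model a program that stores the caller's user ID and then checks for root."""
    memory = {"uid": 0}          # variables typically start out as zero
    previous = memory["uid"]
    memory["uid"] = real_uid     # the program stores the operating system's answer
    if attacker_reverts_store:
        memory["uid"] = previous  # stand-in for the dropped write: the old value returns
    return "root" if memory["uid"] == ROOT_UID else "ask for password"


normal_result = run_privilege_check(real_uid=1000, attacker_reverts_store=False)
attacked_result = run_privilege_check(real_uid=1000, attacker_reverts_store=True)
```

The check itself is never modified; it just reads a stale zero.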
00:18:47
And this is not only
00:18:49
like a specific case
00:18:50
of sudo, this is many
00:18:52
cases how we write
00:18:53
programs, we also,
00:18:55
for error checking,
00:18:56
we start like,
00:18:57
we had no error.
00:18:59
And then we check
00:19:00
certain things and if
00:19:01
we have an error, then
00:19:02
we update that like
00:19:03
we had an error, but
00:19:04
we can revert that.
00:19:06
And even if there was
00:19:07
an error, there's no
00:19:09
trace of that anymore
00:19:10
and we continue.
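The error-flag pattern can be sketched the same way. The first function below models the "start with no error" style described here. The second is a contrast added purely for illustration and is not a mitigation discussed in the episode: it starts from "not allowed", so in this toy model a reverted write fails closed. All names are hypothetical, and an attacker with this capability could still look for other writes to target.

```python
def validate_fail_open(input_ok, attacker_reverts_store):
    """Starts with 'no error'; a reverted error write lets bad input through."""
    error = False
    if not input_ok:
        previous = error
        error = True              # record that something went wrong
        if attacker_reverts_store:
            error = previous      # stand-in for the dropped write
    return not error              # True means the program continues


def validate_fail_closed(input_ok, attacker_reverts_store):
    """Starts with 'not allowed'; a reverted write leaves the check denied."""
    allowed = False
    if input_ok:
        previous = allowed
        allowed = True            # record that the check passed
        if attacker_reverts_store:
            allowed = previous    # stand-in for the dropped write
    return allowed


bad_input_accepted = validate_fail_open(input_ok=False, attacker_reverts_store=True)
bad_input_still_rejected = not validate_fail_closed(input_ok=False, attacker_reverts_store=True)
```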
00:19:12
Right, right.
00:19:13
The second you said it,
00:19:15
it asked the operating
00:19:16
system, what user am I?
00:19:18
I was like, Oh yeah.
00:19:19
Okay.
00:19:19
ID zero.
00:19:20
I see.
00:19:23
That is good because
00:19:24
I, to be honest, I saw
00:19:25
the CacheWarp exploit.
00:19:28
I did not really see how
00:19:30
this was used and now
00:19:32
it makes total sense.
00:19:33
Yeah.
00:19:34
Because I was also
00:19:35
like, how does it-
00:19:37
my understanding was
00:19:39
that you can actually
00:19:40
reset it to a different
00:19:41
previous state and I
00:19:42
never understood how
00:19:43
that worked, but you
00:19:44
basically just clear it
00:19:45
out and it's all zero.
00:19:47
And if you want
00:19:48
something to be zero,
00:19:51
that's the way to go.
00:19:53
Yes.
00:19:53
Yes.
00:19:53
Right.
00:19:54
Right.
00:19:58
Now you got me.
00:20:01
Wow.
00:20:02
I,
00:20:04
that is just brilliant.
00:20:07
I have so many use
00:20:07
cases for that now.
00:20:09
Thanks.
00:20:10
Of course, a lot of
00:20:11
the brilliance is then
00:20:13
still needed for finding
00:20:14
targets, but like,
00:20:15
okay, I can reset that
00:20:17
back to this value.
00:20:18
Where exactly do I
00:20:19
do that in a program
00:20:20
that it gives me
00:20:21
exactly what I want?
00:20:23
But we showed
00:20:24
quite a few cases
00:20:25
where that works.
00:20:25
You only have to time
00:20:27
it correctly, right?
00:20:28
So you have to figure
00:20:29
out when is the correct
00:20:31
point in time when
00:20:33
sudo would actually
00:20:34
ask like, Hey, give
00:20:35
me this operation
00:20:36
or give me this ID.
00:20:37
Yes.
00:20:38
Yes.
00:20:39
So that means you
00:20:41
basically create,
00:20:43
well, not a remote code
00:20:45
execution per se, but
00:20:48
you could probably make
00:20:49
that happen as well.
00:20:51
But you gain,
00:20:55
well, privilege
00:20:56
escalation, basically.
00:20:57
You gain root
00:20:58
permission.
00:20:59
So the next step would
00:21:00
be to find something
00:21:01
else to inject it
00:21:02
into a memory and say,
00:21:03
please execute that.
00:21:04
And I guess with CacheWarp,
00:21:07
you could probably do
00:21:08
the same thing by just
00:21:10
making sure you're
00:21:11
redirecting to the right
00:21:12
memory location now.
00:21:15
Yes.
00:21:15
So, this is one thing
00:21:16
you can work on a very
00:21:18
low level and also the
00:21:19
CPU remembers certain
00:21:21
things, like when you
00:21:23
go to some different
00:21:24
part of the code, how
00:21:25
to return, you can reset
00:21:27
that and you return
00:21:27
to some other place.
00:21:30
Makes it quite nice.
00:21:32
But we also showed
00:21:32
like this full
00:21:33
chain end to end
00:21:34
exploit in two steps.
00:21:37
So you have some way
00:21:39
to log into a server.
00:21:40
SSH, typically.
00:21:42
There's a
00:21:43
password check.
00:21:44
And with CacheWarp,
00:21:47
we trick that password
00:21:48
check into believing
00:21:49
that it does not
00:21:50
matter what we enter.
00:21:51
Any password is correct.
00:21:53
Then we are logged
00:21:54
in as a normal user.
00:21:56
Then you use it
00:21:56
again on sudo.
00:21:58
And then we are logged
00:21:59
into a server, into
00:22:00
any virtual machine
00:22:01
with root privileges.
00:22:03
Then we can just
00:22:05
execute whatever we
00:22:06
want there as root.
00:22:07
So full control
00:22:08
of the VM.
00:22:10
This entire thing takes
00:22:12
just a few seconds.
00:22:13
Wow.
00:22:14
And we had over 90
00:22:16
percent reliability.
00:22:19
And you basically
00:22:21
started with getting
00:22:23
a virtual machine on
00:22:24
some shared resource
00:22:25
and that's just how
00:22:27
you get into it.
00:22:28
It's hard to be precise,
00:22:29
I guess, to target
00:22:30
a specific company,
00:22:31
but you never know
00:22:32
who's alongside you.
00:22:34
Exactly.
00:22:35
Exactly.
00:22:37
Right.
00:22:37
I mean, we also don't
00:22:38
want to do that.
00:22:38
This is where the
00:22:40
academic part also ends.
00:22:42
We kind of already
00:22:43
overstepped it a bit
00:22:44
with showing like step
00:22:46
by step how to get
00:22:47
from this problem, this
00:22:50
logic problem in the
00:22:51
CPU to full, to taking
00:22:54
over an entire VM.
00:22:56
We are not going into
00:22:57
details like how to
00:22:59
attack a specific
00:23:00
company even though
00:23:00
a lot of my students
00:23:02
also ask these things
00:23:03
like, I now want to
00:23:04
break into Microsoft.
00:23:05
What do I do?
00:23:07
But this is not what
00:23:09
we should do and
00:23:09
also not what we do.
00:23:11
That makes sense.
00:23:12
It was more like
00:23:12
a rhetorical
00:23:13
question, obviously.
00:23:17
Wow.
00:23:18
That is quite something,
00:23:20
especially it comes
00:23:22
about two years after,
00:23:23
or two and a half
00:23:24
years after Spectre and
00:23:27
Meltdown, which were
00:23:28
the other big things.
00:23:30
And as you said,
00:23:31
CPUs become more
00:23:32
and more complex.
00:23:33
At that point in time,
00:23:34
it was the predictor,
00:23:38
making guesses
00:23:39
where to jump next.
00:23:40
And it was the
00:23:41
same thing.
00:23:42
You do those things
00:23:43
to speed up CPUs and
00:23:45
to make them better.
00:23:46
But now you have all
00:23:49
those features and
00:23:51
people all just figure
00:23:52
out how to use them.
00:23:54
I think people are
00:23:55
probably more familiar
00:23:56
with Meltdown and
00:23:57
Spectre because it
00:23:57
was like, if you want
00:24:00
to say it was a big
00:24:03
F up, because also it
00:24:05
involved AMD, Intel, and
00:24:07
even ARM CPUs, as far
00:24:09
as I remember, right?
00:24:10
ARM was a little bit
00:24:11
late to the game, but
00:24:11
people figured it out
00:24:12
how to do it as well.
00:24:14
And the Meltdown
00:24:16
thing was like three
00:24:17
different CVEs or
00:24:19
four, even, different-
00:24:21
In the beginning, one
00:24:23
and two for Spectre.
00:24:25
Or two for, yeah.
00:24:26
It was a couple
00:24:27
of iterations.
00:24:29
From your perspective,
00:24:30
which one is worse?
00:24:32
It's a very difficult
00:24:33
question, but also
00:24:35
a really interesting
00:24:36
one to think about.
00:24:37
So when we-
00:24:40
So Meltdown and Spectre, we
00:24:41
published that in 2018,
00:24:43
beginning of 2018, we
00:24:45
discovered it in 2017.
00:24:47
Back then, we did not
00:24:49
fully understand the
00:24:50
consequences of that.
00:24:53
Now in hindsight,
00:24:54
I would say, it
00:24:56
really depends.
00:24:57
So Meltdown is something
00:25:00
that has a huge impact
00:25:02
or had a huge impact,
00:25:03
but luckily, like the
00:25:04
Y2K bug, nobody
00:25:06
saw the impact, because
00:25:09
that was disclosed in
00:25:10
June 2017 and made
00:25:13
public in January 2018.
00:25:16
And so all the vendors
00:25:18
had time to work on
00:25:20
fixes, workarounds.
00:25:22
And when it was
00:25:23
public, we already had
00:25:26
systems protected
00:25:27
against exploitation.
00:25:28
So the issues were not
00:25:29
fixed, but at least
00:25:31
exploitation was made
00:25:33
difficult to impossible,
00:25:34
depending on the system.
00:25:35
So we did not
00:25:36
see the impact.
00:25:37
If that hadn't happened,
00:25:40
that would have been
00:25:40
a huge impact because
00:25:42
the code for mounting that
00:25:43
is so extremely easy.
00:25:45
Back then, our group was
00:25:48
part of the discovery.
00:25:49
We printed T-shirts.
00:25:51
And we could fit
00:25:52
the entire exploit
00:25:52
code on a T-shirt.
00:25:54
I remember that T-shirt.
00:25:56
But that was at least
00:25:58
easy to mitigate.
00:26:00
Spectre on the other
00:26:00
side is really difficult
00:26:02
and we still have that.
00:26:03
And we still have
00:26:04
to deal with that.
00:26:05
We still don't have
00:26:06
real solutions.
00:26:07
But it's also way
00:26:08
harder to exploit
00:26:10
for an attacker.
00:26:11
That's similar to
00:26:12
stuff you have on
00:26:13
software security.
00:26:15
Let's say easier things
00:26:17
like buffer overflows.
00:26:18
We know them since
00:26:19
the eighties.
00:26:20
We know how to
00:26:21
exploit them.
00:26:22
It gets harder and
00:26:23
harder to exploit.
00:26:24
We still have the bugs.
00:26:26
We can't fundamentally
00:26:27
fix them.
00:26:28
We could, but we don't.
00:26:31
But also, we
00:26:32
live with that.
00:26:33
And similar with
00:26:33
Spectre, we don't
00:26:34
know how to fix that.
00:26:36
But as it's so difficult
00:26:37
to exploit, we kind
00:26:38
of live with that.
00:26:40
Meltdown would be easy.
00:26:41
So we had to do
00:26:42
something immediately
00:26:43
and luckily also found
00:26:45
ways for fixing that.
00:26:47
CacheWarp goes in
00:26:48
a similar direction
00:26:49
as Meltdown.
00:26:51
It's very easy
00:26:52
to exploit.
00:26:53
It's extremely powerful,
00:26:55
but luckily also AMD
00:26:57
was able to fix that
00:26:59
very quickly, let's
00:27:01
say within half a year.
00:27:02
But it's not the
00:27:05
greatest fix, as it works
00:27:06
by removing other
00:27:07
functionality. Nowadays
00:27:10
we are used to that,
00:27:11
that our CPUs are
00:27:12
losing functions with
00:27:13
security updates.
00:27:16
At least we have
00:27:16
a workaround.
00:27:17
It's not
00:27:17
exploitable anymore.
00:27:19
I hope cloud providers
00:27:22
also deploy that.
00:27:23
We cannot check.
00:27:25
But at least there
00:27:27
is something where we
00:27:28
could ensure nobody
00:27:29
can exploit it anymore.
00:27:31
So I guess it's a
00:27:32
microcode update and
00:27:33
you basically just load
00:27:34
that through the Linux
00:27:37
kernel or whatever.
00:27:38
Yeah, that makes sense.
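
[Editor's sketch, not part of the conversation: on Linux, the kernel reports its own view of CPU issues such as Meltdown and Spectre as small text files under /sys/devices/system/cpu/vulnerabilities. The snippet below is a minimal illustration of reading that directory. The function name read_vulnerability_status is hypothetical, and whether a CacheWarp-specific entry appears there is an assumption the conversation does not confirm.]

```python
from pathlib import Path

# Standard Linux sysfs directory with one text file per known CPU issue.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_vulnerability_status(vuln_dir=VULN_DIR):
    """Return {issue name: status string}; empty dict if the directory is absent.

    Hypothetical helper for illustration only.
    """
    vuln_dir = Path(vuln_dir)
    if not vuln_dir.is_dir():
        return {}
    status = {}
    for entry in sorted(vuln_dir.iterdir()):
        if entry.is_file():
            try:
                # Each file holds one line, e.g. "Mitigation: PTI" or "Not affected".
                status[entry.name] = entry.read_text().strip()
            except OSError:
                status[entry.name] = "unreadable"
    return status


if __name__ == "__main__":
    for name, state in read_vulnerability_status().items():
        print(f"{name}: {state}")
```

[This only shows what the running kernel reports for the machine it runs on. As noted in the conversation, a cloud customer cannot use it to verify what the provider has deployed on the host.]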
00:27:40
So you think Spectre is
00:27:42
the worst one because
00:27:44
it's hard to fix?
00:27:46
Because this is a
00:27:48
design issue, it's
00:27:50
inherent to the design
00:27:52
versus the others
00:27:53
are implementation
00:27:54
issues where somebody
00:27:56
made a mistake
00:27:56
implementing something.
00:27:58
Right.
00:27:58
Okay.
00:27:58
That makes sense.
00:28:00
Yeah, we're almost-
00:28:02
Well, we are
00:28:03
out of time.
00:28:05
But one last question.
00:28:07
The one that I've
00:28:08
always asked, like,
00:28:09
what do you think is
00:28:10
the next big thing?
00:28:11
Are you working on
00:28:11
something, CacheWarp 2?
00:28:16
Always and even worse.
00:28:18
Of course I can't give
00:28:20
any details on that.
00:28:23
But yes, we just started
00:28:25
to scratch the surface
00:28:27
of all these problems.
00:28:29
And this is also not
00:28:31
surprising if you
00:28:32
think about that.
00:28:33
We had software bugs
00:28:34
for years, decades.
00:28:37
Nobody's surprised
00:28:38
if there's a bug
00:28:38
in software.
00:28:39
We need a patch.
00:28:41
Everybody is suddenly
00:28:42
surprised that we
00:28:42
have that in the CPUs.
00:28:44
But also hardware
00:28:45
is nowadays just
00:28:46
written in software.
00:28:48
We have hardware
00:28:49
description languages.
00:28:50
These are just
00:28:51
programming languages
00:28:52
for hardware.
00:28:54
We compile our CPU.
00:28:56
So a program takes
00:28:57
care of taking our
00:28:58
description, making
00:28:59
that into hardware.
00:29:01
Simplified.
00:29:03
Of course it's also
00:29:04
written by humans
00:29:05
like software.
00:29:06
Humans make mistakes.
00:29:08
That's something
00:29:09
we will always see.
00:29:11
And so it's not a
00:29:12
surprise that we see
00:29:13
a lot of problems
00:29:14
in CPUs as well.
00:29:15
We will find more and
00:29:17
more over the years.
00:29:20
So far, we were
00:29:22
relatively lucky that
00:29:23
we could always add some
00:29:26
quick-fix mitigation
00:29:28
that disabled some
00:29:29
functionality maybe, but
00:29:30
prevented exploitation.
00:29:33
I hope it stays that
00:29:34
way, but I fear not.
00:29:37
And at some point
00:29:37
we will see these
00:29:38
big problems that
00:29:40
we cannot fix.
00:29:42
And this is really
00:29:43
bad because we cannot
00:29:45
simply do an update
00:29:46
like with software.
00:29:47
And then we have to,
00:29:49
well, think about
00:29:50
what to do, right?
00:29:51
And I guess the same thing
00:29:52
goes for the ever
00:29:54
more complex graphics
00:29:57
cards, especially
00:29:58
when you go into the
00:29:59
AI accelerators and
00:30:01
stuff like that, right?
00:30:02
Those things
00:30:04
grow, and get more
00:30:06
complex day by day.
00:30:09
Yes.
00:30:09
Yes.
00:30:10
We are currently mostly
00:30:12
adding complexity
00:30:14
and not trying to
00:30:14
simplify things.
00:30:16
Right.
00:30:17
Complexity
00:30:18
introduces problems.
00:30:21
That is very true.
00:30:22
And every software
00:30:23
developer, I
00:30:24
guess, knows that.
00:30:25
I was kind of shocked
00:30:27
how far off I was
00:30:28
with the Meltdown
00:30:30
and Spectre timing.
00:30:32
Seems like COVID
00:30:33
completely messed with
00:30:34
my feeling for time.
00:30:36
Yeah.
00:30:36
It also feels like
00:30:37
yesterday for me.
00:30:39
I can't imagine that.
00:30:40
Yeah.
00:30:41
Thank you very much.
00:30:42
It was a pleasure
00:30:42
having you.
00:30:43
I hope I have
00:30:44
the chance to have
00:30:45
you back sometime in
00:30:46
the future, after the
00:30:48
next big exploit you
00:30:49
guys are working on.
00:30:51
Because that is
00:30:52
just like incredibly
00:30:53
interesting.
00:30:54
And I think for our
00:30:56
audience, which is
00:30:58
often cloud users, it's
00:30:59
also very relevant.
00:31:02
Thanks for having me.
00:31:03
It was a pleasure
00:31:04
talking to you
00:31:04
about that.
00:31:05
And yes, I'd be
00:31:06
happy to be back.
00:31:07
All right.
00:31:08
Thank you very much to
00:31:10
the audience as well.
00:31:11
Next week, same
00:31:12
time, same place.
00:31:14
And I hope you're
00:31:15
listening in again.
00:31:16
And thank you very much
00:31:17
for being here as well.
00:31:21
The Cloud Commute
00:31:22
Podcast is sponsored
00:31:23
by Simplyblock.
00:31:24
Your own elastic
00:31:25
block storage engine
00:31:25
for the cloud.
00:31:26
Get higher IOPS and
00:31:27
low predictable latency
00:31:29
while bringing down your
00:31:30
total cost of ownership.
00:31:31
www.simplyblock.io

