Building and operating a production-grade PostgreSQL in Kubernetes - Álvaro Hernández Tortosa from OnGres
Cloud Commute · April 05, 2024 · Episode 6
00:24:44 · 22.66 MB

Álvaro Hernández Tortosa from OnGres, a company providing the necessary tooling to run a production-grade PostgreSQL in Kubernetes, talks about the issues with the many-solutions-to-a-problem situation in Postgres and how it can be overwhelming and alarming to newcomers.

For questions, you can reach Álvaro at:

If you want to learn more about OnGres or StackGres:

The Cloud Commute Podcast is presented by simplyblock (https://www.simplyblock.io)


01:00:01
So there's a technology we're working on

01:00:04
right now that works for our use case,

01:00:07
but will work for many use cases also,

01:00:09
which is what we're calling dynamic

01:00:11
containers.

01:00:16
You're listening to simplyblock's Cloud Commute Podcast,

01:00:18
your weekly 20 minute

01:00:19
podcast about cloud technologies,

01:00:21
Kubernetes, security,

01:00:22
sustainability, and more.

01:00:24
Hello, everyone.

01:00:25
Welcome to this week's episode of

01:00:28
Cloud Commute podcast

01:00:30
by simplyblock.

01:00:32
Today, I have another incredible

01:00:34
guest, a really good friend,

01:00:37
Álvaro Hernández from OnGres.

01:00:40
He's very big in the

01:00:41
Postgres community.

01:00:42
Well, probably

01:00:43
podcast community as well.

01:00:44
I don't know.

01:00:45
But at least in the

01:00:46
Postgres community.

01:00:48
So hello, and welcome, Álvaro.

01:00:52
Thank you.

01:00:53
Thank you very much, first of all,

01:00:54
for having me here.

01:00:55
It's an honor.

01:00:58
Oh, I don't know about that.

01:00:59
We'll see.

01:01:01
But OK, let's see.

01:01:03
Maybe just start by introducing

01:01:05
yourself, who you are,

01:01:07
what you've done in the

01:01:08
past, how you got here.

01:01:12
Well, except me inviting you.

01:01:15
OK, well, I don't know how to

01:01:17
describe myself,

01:01:17
but I would say, first of all, I'm

01:01:19
a big nerd, big fan

01:01:20
of open source.

01:01:22
And I've been working with

01:01:23
Postgres, I don't know,

01:01:25
for more than 20

01:01:25
years, 24 years now.

01:01:28
So I'm a big Postgres person.

01:01:30
There's someone out there in the

01:01:32
community that says that if you

01:01:33
say Postgres three

01:01:34
times, I will pop up there.

01:01:35
It's kind of like Superman or

01:01:37
Batman or these superheroes.

01:01:38
No, I'm not a superhero.

01:01:40
But anyway, professionally, I'm

01:01:43
the founder and CEO

01:01:44
of a company called OnGres.

01:01:47
Guess what it

01:01:47
means: on Postgres.

01:01:50
So it's pretty

01:01:51
obvious what we do.

01:01:54
So everything

01:01:55
revolves around Postgres,

01:01:56
but in reality, I love

01:01:57
all kinds of technology.

01:01:58
I've been working a lot with many

01:02:01
other technologies.

01:02:02
I know you, right, because of

01:02:04
being a Java programmer, which

01:02:05
is kind of my hobby work.

01:02:08
I love programming in my free

01:02:09
time, which almost doesn't exist.

01:02:11
But I try to get

01:02:13
some from time to time.

01:02:14
And everything related to

01:02:16
technology in general,

01:02:17
I'm also a big fan and

01:02:19
supporter of open source.

01:02:20
I have contributed

01:02:21
and keep contributing

01:02:22
a lot to open source.

01:02:24
I also founded some open source

01:02:26
communities, like for example,

01:02:28
I'm a Spaniard.

01:02:28
I live in Spain.

01:02:30
And I founded Debian Spain, the

01:02:32
association like, I don't know,

01:02:34
20 years ago.

01:02:35
More recently, I also

01:02:37
founded a foundation,

01:02:38
a nonprofit

01:02:39
foundation also in Spain called

01:02:41
Fundación PostgreSQL.

01:02:43
Again, guess what it does?

01:02:46
And I try to engage a lot with the

01:02:49
open source communities.

01:02:49
We, by the way,

01:02:50
organized a conference

01:02:51
for those who are

01:02:52
interested for Postgres

01:02:54
in the magnificent island of Ibiza

01:02:56
in the Mediterranean Sea in September

01:02:58
this year, 9th to 11th

01:03:01
September for

01:03:01
those who want to join.

01:03:02
So yeah, that's probably a brief

01:03:05
intro about myself.

01:03:07
All right.

01:03:07
So you are basically the

01:03:09
Beetlejuice of Postgres.

01:03:10
That's what you're saying.

01:03:12
Beetlejuice, right, right.

01:03:13
That's more

01:03:13
appropriate than superheroes.

01:03:14
You're absolutely right.

01:03:16
I'm not sure if

01:03:17
he is a superhero,

01:03:18
but he's different at least.

01:03:21
Yes, he is.

01:03:22
You mentioned OnGres.

01:03:24
And I know OnGres isn't really

01:03:26
like the first company.

01:03:27
There were quite a

01:03:28
few before, I think,

01:03:31
El Toro, a database company.

01:03:33
ToroDB.

01:03:33
Oh, ToroDB.

01:03:34
Sorry, close, close, very close.

01:03:37
So what is up with that?

01:03:39
You've been around--

01:03:40
you're trying to do a lot of

01:03:42
different things

01:03:43
and seem to love

01:03:44
trying new things, right?

01:03:47
Yes, yes, yes.

01:03:48
So I sometimes define myself as a

01:03:51
0.x serial entrepreneur,

01:03:54
meaning that I've tried several

01:03:56
ventures and sold none of them.

01:03:59
But I'm still trying.

01:04:03
I'm still trying.

01:04:04
I like to try to be resilient, and

01:04:06
I keep pushing the ideas

01:04:08
that I have in

01:04:09
the back of my head.

01:04:10
So yes, yes, I've done several

01:04:13
ventures, all of them,

01:04:15
around certain patterns.

01:04:19
So for example, you're

01:04:19
asking about ToroDB.

01:04:21
ToroDB is essentially

01:04:23
an open source software

01:04:24
that is meant to replace

01:04:27
MongoDB with, you guessed it,

01:04:30
Postgres, right?

01:04:33
There's a certain pattern in my

01:04:35
professional life.

01:04:36
And the idea of

01:04:37
ToroDB, ToroDB had--

01:04:39
and I'm speaking in the past

01:04:40
because it's unfortunately no longer

01:04:41
a maintained open source project.

01:04:43
We moved on to something else,

01:04:45
which is OnGres.

01:04:47
But the idea of ToroDB was to

01:04:50
essentially replicate

01:04:51
these documents live from MongoDB

01:04:54
and in the process,

01:04:55
real time, transform them

01:04:56
into a set of

01:04:57
relational tables that got

01:04:59
stored inside of a

01:05:00
Postgres database.

01:05:01
So it enabled you to do SQL

01:05:03
queries on your documents that

01:05:06
were in MongoDB.

01:05:07
So think of a MongoDB replica.

01:05:08
You can keep your

01:05:09
MongoDB cluster if you want,

01:05:10
and then you have

01:05:11
all the data in SQL.

01:05:12
This was great for analytics.

01:05:14
You could have great speed ups by

01:05:16
normalizing data

01:05:17
automatically and

01:05:18
then doing queries

01:05:19
with the power of SQL, which

01:05:21
obviously is much broader

01:05:24
and richer than the

01:05:25
MongoDB query language,

01:05:26
especially for analytics.

01:05:27
We got like 100 times

01:05:29
speed up on most queries.

01:05:31
So it was an interesting project.

01:05:34
So that means you basically

01:05:36
generated the schema on the fly

01:05:38
and then generated the table for

01:05:40
that schema specifically.

01:05:42
Interesting.

01:05:43
Yeah, it was generating

01:05:43
tables and columns on the fly.
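
As a rough illustration of the idea described here, the sketch below flattens nested documents into parent and child tables and derives the columns from whatever fields show up. It is not ToroDB's actual implementation: the function names, the table naming scheme, and the `id` / `parent_id` columns are all assumptions made for this example.

```python
from collections import defaultdict


def shred(documents, root="doc"):
    """Flatten nested dict documents into rows grouped by table name."""
    tables = defaultdict(list)

    def visit(table, doc, parent_id=None):
        row_id = len(tables[table]) + 1
        row = {"id": row_id}
        if parent_id is not None:
            row["parent_id"] = parent_id
        tables[table].append(row)  # reserve the row so ids stay sequential
        for key, value in doc.items():
            children = value if isinstance(value, list) else [value]
            if children and all(isinstance(child, dict) for child in children):
                # Nested objects (or lists of them) become rows in a child table.
                for child in children:
                    visit(f"{table}_{key}", child, parent_id=row_id)
            else:
                row[key] = value  # scalars (and scalar lists) stay as a column

    for document in documents:
        visit(root, document)
    return dict(tables)


def columns(tables):
    """Derive each table's column set from the rows seen so far."""
    return {
        name: sorted({col for row in rows for col in row})
        for name, rows in tables.items()
    }


docs = [
    {"name": "ada", "address": {"city": "Madrid"}, "orders": [{"sku": "a"}, {"sku": "b"}]},
    {"name": "bob", "age": 5},
]
tables = shred(docs)
```

With the rows split this way, an analytical question such as "orders per customer" becomes a join between `doc` and `doc_orders` on `parent_id`.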

01:05:46
Right.

01:05:47
OK, interesting.

01:05:49
So now you're doing

01:05:51
the OnGres thing.

01:05:52
And OnGres has, I

01:05:53
think, the main product

01:05:55
is StackGres, as far as I know.

01:05:58
Can you tell a

01:05:59
little bit about that?

01:06:00
Yes.

01:06:01
So OnGres, as I

01:06:03
said, means on Postgres.

01:06:04
And one of our goals in

01:06:05
OnGres is that we believe

01:06:07
that Postgres is a

01:06:08
fantastic database.

01:06:09
I don't need to

01:06:09
explain that to you, right?

01:06:12
But it's kind of the Linux kernel,

01:06:15
if I may use this parallel.

01:06:17
It's a bit bare bones.

01:06:18
You need something around it.

01:06:20
You need a distribution, right?

01:06:22
So Postgres is a

01:06:24
little bit the same thing.

01:06:25
The core is small, it's fantastic,

01:06:27
it's very featureful,

01:06:28
it's reliable, it's trustable.

01:06:30
But it needs tools around it.

01:06:31
So our vision in OnGres is to

01:06:33
develop this ecosystem

01:06:34
around this Postgres core, right?

01:06:37
And one of the

01:06:38
things that we experienced

01:06:39
during our professional lifetime

01:06:42
is that Postgres

01:06:44
requires a lot of tools around it.

01:06:47
It needs

01:06:47
monitoring, it needs backups,

01:06:49
it needs high availability, it

01:06:51
needs connection pooling.

01:06:53
By the way, do not use Postgres

01:06:55
without connection pooling,

01:06:56
right?

01:06:57
So you need kind of a

01:06:58
lot of tools around.

01:06:59
And none of these tools

01:07:00
come with the core.

01:07:02
You need to look

01:07:02
into the ecosystem.

01:07:04
And actually,

01:07:04
this is good and bad.

01:07:06
It's good because

01:07:06
there's a lot of options.

01:07:07
It's bad because

01:07:08
there's a lot of options.

01:07:09
Meaning which one to

01:07:10
choose, which one is good,

01:07:12
which one is bad, which one goes

01:07:13
with a good backup solution

01:07:15
or the good monitoring solution

01:07:16
and how you configure them all.

01:07:18
So this was a problem that we

01:07:19
coined as a stack problem.

01:07:22
So when you really want to run

01:07:23
Postgres in production,

01:07:24
you need the stack on top of

01:07:25
Postgres, right?

01:07:26
To orchestrate all

01:07:28
these components.

01:07:29
Now, the problem is

01:07:30
that we've been doing this

01:07:31
a lot of time for our customers.

01:07:33
Typically, we love

01:07:34
infrastructure as code, right?

01:07:35
And everything was done with

01:07:36
Ansible and similar tools

01:07:39
and Terraform for

01:07:39
infrastructure and Ansible

01:07:40
for orchestrating

01:07:42
these components.

01:07:43
But the reality is

01:07:44
that every environment

01:07:45
that we looked at was

01:07:47
slightly different.

01:07:48
And we can't just take

01:07:49
our Ansible code and say,

01:07:50
yeah, run it.

01:07:51
You've got this stack.

01:07:53
No, because your

01:07:54
storage is different.

01:07:55
Your networking is different.

01:07:56
Your entry point.

01:07:57
Here, one is using VPS.

01:08:00
Sorry, virtual IPs.

01:08:01
That one is using DNS.

01:08:02
That one is using proxies.

01:08:03
And then the compute is also

01:08:05
somehow different.

01:08:06
And it was not reusable.

01:08:09
We were doing a lot

01:08:09
of copy, paste, modify,

01:08:12
something that was

01:08:13
not very sustainable.

01:08:14
At some point, we

01:08:14
started thinking,

01:08:15
is there a way in

01:08:16
which we can package

01:08:18
this stack into a

01:08:20
single deployable unit

01:08:21
that we can take

01:08:22
essentially anywhere?

01:08:23
And the answer was Kubernetes.

01:08:26
Kubernetes provides

01:08:27
us this abstraction

01:08:29
where we can abstract away this

01:08:30
compute, this storage,

01:08:32
this networking, and code against

01:08:34
a programmable API

01:08:36
that we can indeed

01:08:38
create this package.

01:08:39
So that's StackGres.

01:08:40
So StackGres is the

01:08:41
stack of components

01:08:43
you need to run

01:08:44
production Postgres,

01:08:45
packaged in a way that is uniform

01:08:47
across any environment

01:08:47
where you want to run it, cloud,

01:08:49
on-prem, it doesn't matter.

01:08:51
And it's production ready!

01:08:52
And it's packaged at a

01:08:53
very, very high level.

01:08:54
So basically you

01:08:56
barely need, I would say,

01:08:57
you don't need

01:08:58
Postgres knowledge

01:08:59
to run a production ready

01:09:01
enterprise quality

01:09:02
Postgres cluster in production.

01:09:04
And that's the main

01:09:05
goal of StackGres.

01:09:07
Right, right.

01:09:08
And as far as I know,

01:09:10
I think it's implemented as a

01:09:11
Kubernetes operator, right?

01:09:14
Yes, exactly.

01:09:15
And there's quite a few other

01:09:17
operators as well.

01:09:19
But I know that StackGres has some

01:09:20
things which are,

01:09:22
well, which are done

01:09:22
slightly different.

01:09:23
Let's say it that way.

01:09:24
Or at least you're

01:09:26
looking into the future

01:09:27
trying to figure out some things

01:09:29
that others don't.

01:09:32
Can you talk a

01:09:33
little bit about that?

01:09:34
I don't know how much

01:09:35
you do wanna actually

01:09:37
make this public right now.

01:09:39
No, actually

01:09:41
everything is open source.

01:09:42
Our roadmap is open source, our

01:09:43
issues are open source.

01:09:44
I'm happy to share everything.

01:09:46
Well, first of all,

01:09:47
what I would say is that

01:09:49
the operator

01:09:49
pattern is essentially

01:09:51
these controllers

01:09:52
that take actions

01:09:53
on your cluster and the CRDs.

01:09:55
We gave a lot of

01:09:56
thought to these CRDs.

01:09:58
I would say that in a

01:09:58
lot of operators,

01:10:00
CRDs are kind of a byproduct.

01:10:01
An afterthought:

01:10:02
"I have my objects

01:10:03
and then some script

01:10:05
generates the CRDs."

01:10:06
No, we said CRDs are our

01:10:09
user-facing API.

01:10:10
The CRDs are our extended API.

01:10:13
And the goal of operators is to

01:10:15
abstract away

01:10:17
and package business logic, right?

01:10:19
And expose it with a

01:10:20
simple user interface.

01:10:21
So we designed our CRDs to be

01:10:23
very, very high level,

01:10:24
very amenable to the user,

01:10:26
so that again, you don't require

01:10:27
any Postgres expertise.

01:10:29
So if you look at the CRDs, in

01:10:30
practical terms,

01:10:31
the YAMLs, right?

01:10:32
The YAMLs that

01:10:32
you write to deploy

01:10:34
something on StackGres,

01:10:36
they should be so simple that anyone could

01:10:37
deploy them, right?

01:10:38
You could explain to your

01:10:39
five-year-old kid

01:10:40
and your five-year-old kid should

01:10:41
be able to deploy Postgres

01:10:42
in a production-quality cluster, right?

01:10:45
And that's our goal.

01:10:46
And if we didn't

01:10:47
fulfill this goal,

01:10:48
please raise an issue on our

01:10:49
public issue tracker on GitLab

01:10:51
because we definitely have failed

01:10:52
if that's not true.

01:10:54
So we are not focusing only on the

01:10:56
usual Postgres user,

01:10:59
who is very knowledgeable, very advanced.

01:11:01
Most operators

01:11:02
focus on low-level CRDs

01:11:05
and they require Postgres

01:11:08
expertise, probably a lot.

01:11:10
We want to make Postgres more

01:11:12
mainstream than ever, right?

01:11:14
Postgres increases in popularity every year

01:11:16
and it's being adopted by more and

01:11:17
more organizations,

01:11:18
but not everybody's a

01:11:19
Postgres expert.

01:11:19
We want to make Postgres

01:11:20
universally accessible

01:11:21
for everyone.

01:11:23
So one of the things is that we

01:11:24
put a lot of effort

01:11:25
into this design.

01:11:26
And also, instead of one

01:11:28
big, gigantic CRD,

01:11:30
we have multiple.

01:11:31
They actually can be

01:11:33
attached like in an ER diagram

01:11:36
between them.

01:11:37
So you understand

01:11:37
relationships, you create one

01:11:39
and then you reference many times,

01:11:40
you don't need to

01:11:41
restart or reconfigure

01:11:43
the configuration files.
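
The "create one, reference it many times" relationship between resources can be sketched roughly as below. This is only an illustration: the resource kinds (`PostgresConfig`, `Cluster`), the `configRef` field, and the `resolve` helper are hypothetical names chosen for this example and are not taken from the StackGres CRD reference.

```python
# Hypothetical, simplified resources. The kind and field names are
# illustrative only and do not mirror the real StackGres CRDs.
resources = [
    {"kind": "PostgresConfig", "name": "tuned", "spec": {"shared_buffers": "2GB"}},
    {"kind": "Cluster", "name": "orders", "spec": {"instances": 2, "configRef": "tuned"}},
    {"kind": "Cluster", "name": "billing", "spec": {"instances": 3, "configRef": "tuned"}},
]


def resolve(resources):
    """Return each cluster's settings with its referenced config looked up by name."""
    configs = {r["name"]: r["spec"] for r in resources if r["kind"] == "PostgresConfig"}
    resolved = {}
    for r in resources:
        if r["kind"] != "Cluster":
            continue
        ref = r["spec"]["configRef"]
        if ref not in configs:
            raise KeyError(f"cluster {r['name']!r} references unknown config {ref!r}")
        resolved[r["name"]] = {
            "instances": r["spec"]["instances"],
            "postgres": configs[ref],
        }
    return resolved


clusters = resolve(resources)
```

Both clusters point at the same `tuned` configuration, so the configuration is written once and shared, much like a foreign key in an ER diagram.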

01:11:46
Another area where I would say we

01:11:48
have tried to do something

01:11:49
is extensions.

01:11:50
Postgres extensions are

01:11:51
one of the most loved,

01:11:51
if not the most

01:11:52
loved feature, right?

01:11:54
And StackGres is the operator that

01:11:55
arguably supports

01:11:56
the largest number of extensions,

01:11:58
over 200 extensions

01:11:59
as of now and growing.

01:12:00
And we did this because we

01:12:02
developed a custom solution,

01:12:04
which is also open

01:12:04
source by StackGres,

01:12:06
where we can load extensions

01:12:07
dynamically into the cluster.

01:12:09
So we don't need to

01:12:09
build you a fat container

01:12:10
with 200 extensions and a lot of

01:12:12
security issues, right?

01:12:14
But rather we deploy you a

01:12:16
container with no extensions.

01:12:19
And then you say, "I want this,

01:12:20
this, this and that."

01:12:20
And then they will appear in your

01:12:21
cluster automatically.

01:12:22
And this is done via simple YAML.

01:12:24
So we have a very

01:12:26
powerful extension mechanism.

01:12:28
And the other thing is

01:12:29
that we not only expose

01:12:32
the usual CRD YAML interface for

01:12:35
interacting with StackGres,

01:12:37
it's more than fine and I love it,

01:12:38
but it comes with a

01:12:39
fully fledged web console.

01:12:41
Not everybody also likes command

01:12:43
line or GitOps approach.

01:12:45
We do, but not everybody does.

01:12:46
And it's a fully

01:12:47
fledged web console

01:12:48
which also has support

01:12:49
for single sign-on,

01:12:51
where you can

01:12:51
integrate with your AD,

01:12:53
with your OIDC provider,

01:12:55
anything that you want.

01:12:56
It has detailed

01:12:57
fine-grained permissions

01:12:58
based on Kubernetes RBAC.

01:12:59
So you can say, "Who

01:13:00
can create clusters,

01:13:01
who can view configurations, who

01:13:02
can do anything?"

01:13:04
And last but not

01:13:05
least, there's a REST API.

01:13:07
So if you prefer to

01:13:08
automate and integrate

01:13:08
with another kind of solution,

01:13:10
you can also use the

01:13:11
REST API and create clusters

01:13:12
and manage

01:13:13
clusters via the REST API.

01:13:14
And these three mechanisms, the

01:13:16
YAML files, CRDs,

01:13:17
the REST API and the web console

01:13:21
are fully interchangeable.

01:13:22
You can use one for one operation,

01:13:23
the other one for another, and everything goes

01:13:25
back to the same place.

01:13:26
So you can use any

01:13:28
one that you want.

01:13:30
And lately we also

01:13:31
have added sharding.

01:13:33
So sharding scale out with

01:13:35
solutions like Citus,

01:13:37
but we also support

01:13:37
foreign data wrappers,

01:13:38
Postgres with partitioning and

01:13:40
Apache ShardingSphere.

01:13:42
Our way is to create like a

01:13:44
cluster of multiple instances.

01:13:46
Not only one

01:13:46
primary and one replica,

01:13:48
but a coordinator

01:13:49
layer and then shards,

01:13:50
and each shard and

01:13:51
coordinator has a replica.

01:13:52
So typically dozens of instances,

01:13:56
and you can create them with a

01:13:57
simple YAML file

01:13:59
and very high-level description,

01:14:00
which requires little knowledge and wires

01:14:02
everything for you.

01:14:03
So it's very, very convenient to

01:14:04
make things simple.

01:14:06
Right, right.

01:14:07
So the plugin mechanism or the

01:14:10
extension mechanism,

01:14:11
that was exactly what I hinted at.

01:14:13
That was mind-blowing.

01:14:14
I've never seen anything like that

01:14:15
when you showed it last

01:14:16
year in Ibiza, I think.

01:14:19
Last autumn or whatever.

01:14:21
So that was really cool.

01:14:23
The other thing that is

01:14:25
always a little bit of

01:14:29
like a head-scratcher, I

01:14:30
think, for a lot of people

01:14:31
when they hear that

01:14:32
a Kubernetes operator

01:14:34
is actually written in Java.

01:14:36
I think Red Hat built

01:14:37
the original framework.

01:14:39
So it kind of makes sense

01:14:40
that Red Hat is doing that,

01:14:42
I think the original

01:14:43
framework was a Go library.

01:14:47
And Java would probably not be

01:14:49
like the first choice

01:14:50
to do that.

01:14:51
So what was that?

01:14:53
How did that come along?

01:14:55
Well, at first

01:14:56
you're right.

01:14:57
Like the operator

01:14:58
framework is written in Go

01:15:00
and there was nothing

01:15:01
else than Go at the time.

01:15:02
So we were looking

01:15:03
at that, but our team,

01:15:04
we had a team of very, very senior

01:15:05
Java programmers

01:15:07
and none of them were Go

01:15:08
programmers, right?

01:15:10
But what I've seen in the

01:15:10
Postgres community

01:15:11
and other

01:15:12
communities is that people

01:15:13
who are kind of more

01:15:14
in the DevOps world,

01:15:16
maybe working as

01:15:18
DBAs, DB Ops,

01:15:20
they switch to being Go programmers.

01:15:22
It's kind of a bit

01:15:23
natural, but at the same time,

01:15:25
they are not senior from a Go

01:15:28
programming perspective, right?

01:15:29
The same would have happened with

01:15:30
our team, right?

01:15:31
They would switch from Java to Go.

01:15:33
They would not have been senior in Go,

01:15:35
obviously, right?

01:15:36
Right.

01:15:37
So it would have taken some time

01:15:38
to develop those skills.

01:15:40
On the other hand, we looked at

01:15:41
what is the technology

01:15:42
behind, what is an operator?

01:15:43
And an operator is no more than

01:15:47
essentially an HTTP server

01:15:49
that receives

01:15:50
callbacks from Kubernetes

01:15:52
and a client because it

01:15:53
makes calls to Kubernetes.

01:15:55
And HTTP clients and

01:15:56
servers can be written

01:15:57
in any language.
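
To make the "HTTP server that receives callbacks" half of that description concrete, here is a minimal sketch of one kind of callback Kubernetes can make: an admission webhook, where the API server POSTs an `AdmissionReview` and expects one back. The validation rule (`spec.instances` must be at least 1), the function names, and the port are assumptions for this example; it is not how StackGres is implemented, and a real webhook must be served over TLS.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def review(admission_review):
    """Build the AdmissionReview response for one callback from the API server."""
    request = admission_review["request"]
    spec = request["object"].get("spec", {})
    # Hypothetical rule: a cluster must ask for at least one instance.
    allowed = spec.get("instances", 0) >= 1
    response = {"uid": request["uid"], "allowed": allowed}
    if not allowed:
        response["status"] = {"message": "spec.instances must be >= 1"}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


class CallbackHandler(BaseHTTPRequestHandler):
    """Plain HTTP handler: read the JSON callback, answer with JSON."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        payload = json.dumps(review(body)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(port=8080):
    """Create (but do not start) the callback server; real webhooks need TLS."""
    return HTTPServer(("", port), CallbackHandler)
```

Calling `make_server().serve_forever()` would start it. Nothing here is specific to Python: the same request and response shapes can be handled from Java, Go, or any language with an HTTP stack, which is the point being made.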

01:15:59
So we looked at the core: how

01:16:00
complicated this is

01:16:01
and how much this operator

01:16:03
framework brings to you.

01:16:04
And we saw that it

01:16:05
was not that much.

01:16:06
And actually

01:16:07
for example, as I

01:16:09
just mentioned before, the CRDs

01:16:10
are kind of generated

01:16:11
from your structures

01:16:12
and we really wanted

01:16:13
to do it the opposite way.

01:16:14
This is like with a database.

01:16:15
Either you use an ORM to read your

01:16:17
existing database schema

01:16:18
that you developed with

01:16:19
all your SQL capabilities,

01:16:21
or you just create an object and

01:16:22
let that generate the database.

01:16:24
I prefer the former.

01:16:25
So we did the same

01:16:26
thing with the CRDs, right?

01:16:27
And we wanted to develop them.

01:16:28
So Java was more than okay to

01:16:31
develop a Kubernetes operator

01:16:33
and our team was expert in Java.

01:16:35
So by doing it in

01:16:36
Java, we were able

01:16:37
to be very efficient and

01:16:39
deliver a lot of value,

01:16:40
a lot of features very, very fast

01:16:42
without having to retrain anyone,

01:16:44
learn a new language,

01:16:45
or learn new skills.

01:16:47
On top of this,

01:16:47
there's sometimes a concern

01:16:49
that Java requires a JVM,

01:16:50
which is kind of a heavy

01:16:52
environment, right?

01:16:53
And consumes memory

01:16:53
and resources, and disk.

01:16:55
But by default, StackGres uses a

01:16:57
compilation technology

01:16:59
and a whole project around it

01:17:00
called GraalVM.

01:17:01
And this allows us to

01:17:02
generate native images

01:17:03
that are indistinguishable from

01:17:05
any other binary,

01:17:06
Linux binary you can

01:17:07
have with your system.

01:17:08
And we deploy

01:17:08
StackGres with native images.

01:17:11
You can also switch to

01:17:12
JVM images if you prefer.

01:17:13
We expose both, but by

01:17:15
default, they are native images.

01:17:16
So at the end of the day, StackGres

01:17:17
is a file of several megabytes,

01:17:20
a Linux binary, and the

01:17:22
container, and that's it.

01:17:24
That makes sense.

01:17:25
And I like that you

01:17:26
basically pointed out

01:17:27
that the efficiency of the

01:17:29
existing developers

01:17:31
was much more

01:17:31
important than like being cool

01:17:34
and going for a new language

01:17:36
just because everyone does.

01:17:38
So we talked about the

01:17:40
operator quite a bit.

01:17:42
Like what are your general

01:17:43
thoughts on databases

01:17:44
in the cloud or

01:17:45
specifically in Kubernetes?

01:17:47
What are like the

01:17:48
issues you see, the problems

01:17:52
running a database in

01:17:53
such an environment?

01:17:56
Well, it's a wide topic, right?

01:17:58
And I think one of the most

01:18:01
interesting topics

01:18:01
that we're seeing

01:18:02
lately is a concern

01:18:04
about cost and performance.

01:18:08
So there's kind of a

01:18:10
trade off as usual, right?

01:18:11
There's a trade off

01:18:12
between the convenience

01:18:13
I want to run a database and

01:18:16
almost forget about it.

01:18:18
And that's why you switched to a

01:18:19
cloud managed service

01:18:21
which is not

01:18:22
always true by the way,

01:18:23
because forget about it means that

01:18:25
nobody's gonna

01:18:26
vacuum your database,

01:18:28
repack your tables, right?

01:18:31
Optimize your queries, analyze if

01:18:33
you have unused indexes.

01:18:35
So if you're very

01:18:36
small, that's more than okay.

01:18:37
You can assume that you don't need

01:18:39
to touch your database

01:18:39
ever. If you grow

01:18:40
over a certain level,

01:18:43
you're gonna need the

01:18:44
same DBAs, the same,

01:18:46
at least to operate beyond the basic

01:18:48
operations of the database

01:18:49
which are monitoring,

01:18:50
high availability and backups.

01:18:52
So those are the three main areas

01:18:53
that a managed

01:18:54
service provides to you.

01:18:55
But so there's convenience,

01:18:57
but then there's

01:18:58
an additional cost.

01:18:59
And this additional cost sometimes

01:19:01
is quite notable, right?

01:19:03
So it's typically

01:19:04
around 80% premium

01:19:06
on top of N+1 divided by N

01:19:08
instances,

01:19:10
because sometimes you even need an

01:19:10
extra instance

01:19:11
for many cloud services, right?

01:19:13
And that, multiplied by 1.8, ends up

01:19:15
being two point something

01:19:16
in the usual case.

01:19:17
So you're overpaying that.
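
The arithmetic behind that estimate can be written out as a short sketch. The 80% premium and the one extra instance are the figures quoted in the conversation; the function name and the example cluster sizes are assumptions for illustration, and real pricing varies by provider.

```python
def managed_cost_multiplier(instances, premium=0.80, extra_instances=1):
    """Managed-service cost relative to running `instances` nodes yourself.

    (instances + extra) / instances accounts for the additional instance,
    and (1 + premium) for the managed-service markup.
    """
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return (instances + extra_instances) / instances * (1 + premium)


for n in (2, 3, 5):
    print(n, round(managed_cost_multiplier(n), 2))
```

For two, three, and five instances this gives roughly 2.7, 2.4, and 2.16 times the self-managed cost, which is where the "two point something" comes from.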

01:19:19
So you need to analyze whether

01:19:20
this is good for you

01:19:22
from this

01:19:22
perspective of convenience

01:19:24
or you want to

01:19:25
have something else.

01:19:26
On the other hand,

01:19:27
almost all cloud services,

01:19:29
they use network disks.

01:19:32
And these network

01:19:33
disks are very good

01:19:34
and have improved performance a

01:19:35
lot in the last years,

01:19:37
but still they are

01:19:39
far from the performance

01:19:41
of a local drive, right?

01:19:43
And running

01:19:43
databases with local drives

01:19:45
has its own challenges, but they

01:19:46
can be addressed.

01:19:47
And you can really, really move

01:19:49
the needle by kind of,

01:19:52
I don't know if

01:19:52
that's the right term

01:19:53
to call it self-hosting,

01:19:54
but this trend of self-hosting,

01:19:57
and if we could

01:19:58
marry the simplicity

01:20:00
and the convenience of managed

01:20:02
services, right?

01:20:03
With the ability of

01:20:06
running on any environment

01:20:07
and running on any environment at

01:20:09
a much higher performance,

01:20:10
I think that's kind of an

01:20:12
interesting trend right now

01:20:13
and a good sweet spot.

01:20:15
And Kubernetes, to try

01:20:16
to marry all the terms

01:20:17
that you

01:20:18
mentioned in the question,

01:20:19
actually is one

01:20:20
driver towards this goal

01:20:22
because it enables us

01:20:24
infrastructure independence

01:20:26
and it enables both

01:20:26
network disks and local disks

01:20:29
equally, in the same way.

01:20:31
And it's kind of an

01:20:32
enabler for this pattern

01:20:33
that I see trending more

01:20:35
and more as of now,

01:20:36
becoming more important, and

01:20:37
one that we are definitely

01:20:38
looking forward to.

01:20:40
Right, I like

01:20:41
that you pointed out

01:20:42
that there's ways to address the

01:20:44
local storage issues,

01:20:45
just shameless plug, we're

01:20:46
actually working on something.

01:20:49
I heard something.

01:20:51
I heard something.

01:20:52
Oh, you heard something.

01:20:54
I know, I know.

01:20:56
All right, last question

01:20:58
because we're also

01:20:59
running out of time.

01:21:00
What do you see as the

01:21:02
biggest trend right now

01:21:03
in containers, cloud, whatever?

01:21:05
What do you think is

01:21:06
like the next big thing?

01:21:08
And don't say AI,

01:21:09
everyone says that.

01:21:10
Oh, no.

01:21:12
Well, you know what?

01:21:14
Let me do a

01:21:17
shameless plug here, right?

01:21:18
All right.

01:21:19
I did one.

01:21:20
(laughing)

01:21:21
So there's a technology we're

01:21:25
working on right now

01:21:27
that works for our use case,

01:21:28
but will work for

01:21:29
many use cases also,

01:21:31
which is what we're calling

01:21:32
dynamic containers.

01:21:34
So containers are essentially

01:21:36
something static,

01:21:37
right?

01:21:38
You build a

01:21:38
container, you have a build

01:21:39
with your Dockerfile,

01:21:40
whatever you use, right?

01:21:41
And then that image is static.

01:21:43
It is what it is.

01:21:44
Contains the layers that you

01:21:45
specified and that's all.

01:21:47
But if you look at any repository

01:21:49
in Docker Hub, right?

01:21:51
There's plenty of tags.

01:21:52
You have what, for

01:21:53
example, Postgres.

01:21:54
There's Postgres based on Debian.

01:21:56
There's Postgres based on Alpine.

01:21:58
There's Postgres with this option.

01:22:00
Then you want this extension,

01:22:02
then you want

01:22:02
this other extension.

01:22:03
And then there's a whole variety

01:22:05
of images, right?

01:22:07
And each of those images needs to

01:22:09
be built independently,

01:22:11
maintained, updated

01:22:12
independently, right?

01:22:14
But they're very orthogonal.

01:22:15
Like upgrading the Debian base OS

01:22:18
has nothing to do

01:22:19
with the Postgres layer,

01:22:20
has nothing to do with the

01:22:21
Timescale extension,

01:22:22
has nothing to do with whether I

01:22:24
want the debug symbols or not.

01:22:26
So we're working on technology

01:22:28
with the goal of being able to,

01:22:30
as a user, express

01:22:32
any combination of items

01:22:34
I want for my container and get

01:22:36
that container image

01:22:37
without having to rebuild and

01:22:38
maintain the image

01:22:40
with the specific

01:22:41
parameters that I want.
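
A small sketch may help show why orthogonal choices are painful with static images. It counts how many pre-built images a catalogue of options implies versus how many independent pieces there are, and then assembles a manifest-like description for one selection. Everything here is hypothetical: the option catalogue, the fake digests, and the `compose_manifest` helper illustrate the idea only and do not describe how the technology being discussed works internally.

```python
import hashlib
from math import prod

# Hypothetical option catalogue; names and choices are illustrative only.
OPTIONS = {
    "base": ["debian", "alpine"],
    "postgres": ["15", "16"],
    "extension": ["none", "timescaledb", "postgis"],
    "debug_symbols": ["no", "yes"],
}


def static_image_count(options):
    """Images to build and maintain if every combination is its own tag."""
    return prod(len(choices) for choices in options.values())


def piece_count(options):
    """Independent pieces to maintain if combinations are composed on demand."""
    return sum(len(choices) for choices in options.values())


def piece_digest(kind, choice):
    """Stand-in for a content digest (a real registry hashes the layer bytes)."""
    return "sha256:" + hashlib.sha256(f"{kind}={choice}".encode()).hexdigest()


def compose_manifest(selection, options=OPTIONS):
    """Describe one requested combination as an ordered list of pieces."""
    layers = []
    for kind, choices in options.items():
        choice = selection[kind]
        if choice not in choices:
            raise ValueError(f"unknown {kind}: {choice!r}")
        layers.append({"kind": kind, "choice": choice, "digest": piece_digest(kind, choice)})
    return {"layers": layers}


manifest = compose_manifest(
    {"base": "debian", "postgres": "16", "extension": "timescaledb", "debug_symbols": "no"}
)
```

With this catalogue there are 24 possible static images but only 9 pieces to keep up to date, and updating the Debian base touches one piece rather than 12 images.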

01:22:43
Right, and let me guess,

01:22:44
that is how the

01:22:45
Postgres extension stuff works.

01:22:47
It is meant

01:22:48
as a solution

01:22:50
for the Postgres extensions,

01:22:51
but it's actually quite broad and

01:22:53
quite general, right?

01:22:54
Like, for example, I was

01:22:55
discussing recently

01:22:56
with some folks of the

01:22:57
OpenTelemetry community,

01:22:58
and the OpenTelemetry collector,

01:23:01
which is the router for signals

01:23:03
in the OpenTelemetry world, right?

01:23:05
Has the same architecture,

01:23:07
has like around

01:23:07
200 plugins, right?

01:23:09
And you don't want

01:23:09
a container image

01:23:10
with those 200 plugins,

01:23:12
which potentially,

01:23:13
because many come from third parties,

01:23:14
may have some security

01:23:15
vulnerabilities,

01:23:16
or even if there's an update,

01:23:17
you don't want to update all those

01:23:18
and restart your containers and

01:23:19
all that, right?

01:23:20
So why not get a

01:23:21
container image with,

01:23:23
the OpenTelemetry

01:23:24
collector with this source

01:23:25
and this receiver

01:23:26
and this exporter, right?

01:23:29
So that's actually

01:23:30
probably more applicable.

01:23:32
Yeah, I think that

01:23:33
makes sense, right?

01:23:34
I think that is a really good end,

01:23:38
especially because with static

01:23:40
containers in the past,

01:23:41
the original idea was that

01:23:43
being static gives you some kind of

01:23:45
like consistency

01:23:47
and some security on how the

01:23:48
container looks like,

01:23:49
but we figured out that over time,

01:23:51
that is not the best solution.

01:23:53
So I'm really looking

01:23:53
forward to that being

01:23:55
probably a more general thing.

01:23:57
To be honest, actually the idea,

01:23:59
I call it dynamic containers,

01:24:01
but in reality, from

01:24:02
a user perspective,

01:24:03
they're just as static as before.

01:24:04
They are dynamic from

01:24:06
the registry perspective.

01:24:08
Right, okay, fair enough.

01:24:10
All right, thank you very much.

01:24:14
It was a pleasure like

01:24:15
always talking to you.

01:24:17
And for the other

01:24:19
ones, I'll see you next week

01:24:21
or hopefully you hear me next week

01:24:23
with my next guest.

01:24:25
And thank you to Álvaro,

01:24:27
thank you for being here.

01:24:28
It was appreciated like always.

01:24:30
Thank you very much.

01:24:31
The Cloud Commute podcast is sponsored by

01:24:33
simplyblock, your own elastic

01:24:35
block storage engine for the cloud.

01:24:37
Get higher IOPS and low predictable

01:24:39
latency while bringing down your

01:24:40
total cost of ownership.

01:24:42
www.simplyblock.io