Jimmy Angelakos, a freelance consultant, talks about his experiences with customers running PostgreSQL on bare-metal, in the cloud, and on Kubernetes. He also talks about his new book "PostgreSQL Mistakes and How to Avoid Them" (discount link below).
If you have questions to Jimmy, you can reach him here:
- LinkedIn: https://www.linkedin.com/in/vyruss/
- X/Twitter: https://twitter.com/vyruss
- Mastodon: https://fosstodon.org/@vyruss
Additional show notes:
- PostgreSQL Mistakes and How to Avoid Them: http://mng.bz/NVzn
- PostgreSQL 16 Administration Cookbook: https://packt.link/DBjEW
- Simon Riggs' Keynote "The Next 20 Years": https://www.youtube.com/watch?v=8W-J36IxYv4
The Cloud Commute Podcast is presented by simplyblock (https://www.simplyblock.io)
01:00:00
That has anti-patterns that are
01:00:03
not really restricted to Postgres.
01:00:05
It's things like, don't improvise,
01:00:09
don't create your own
01:00:10
distributed systems.
01:00:12
There's people that have spent
01:00:14
hundreds of thousands of hours
01:00:17
working on these problems, and you
01:00:19
don't need to reinvent the wheel.
01:00:25
You're listening to simplyblock's Cloud Commute Podcast,
01:00:27
your weekly 20 minute
01:00:28
podcast about cloud technologies,
01:00:30
Kubernetes, security,
01:00:32
sustainability, and more.
01:00:34
Welcome back everyone.
01:00:35
Welcome to the next episode of
01:00:37
simplyblock's
01:00:37
Cloud Commute podcast.
01:00:39
Today I have a very
01:00:41
interesting guest,
01:00:41
very different from
01:00:42
the other ones before,
01:00:45
because he's actually an author,
01:00:47
writing a book right now.
01:00:50
Well, I think he already published
01:00:51
one or two at least.
01:00:54
But he'll going to
01:00:55
talk about that himself.
01:00:57
Welcome, Jimmy.
01:00:59
Hi, very nice to be here.
01:01:02
Very nice. Thank
01:01:04
you for being here.
01:01:06
Maybe we just start simple with
01:01:09
the basic stuff.
01:01:10
Who are you?
01:01:11
Where are you from?
01:01:12
What do you do for a
01:01:14
living? Except writing a book.
01:01:18
My name is Jimmy Angelakos,
01:01:21
which is obviously a Greek name.
01:01:24
I live in Edinburgh in Scotland.
01:01:28
I've been working with Postgres
01:01:30
for maybe around 16
01:01:33
years now, exclusively.
01:01:35
I haven't used any other database
01:01:36
in 16 years in a
01:01:38
professional capacity.
01:01:41
Naturally, the time came to share
01:01:46
my experiences and I wrote a
01:01:47
couple of books on this.
01:01:48
Well, I actually co-wrote the
01:01:53
PostgreSQL16
01:01:54
administration cookbook
01:01:56
with my lovely
01:01:58
co-authors Boris Mejías,
01:02:02
Gianni Ciolli, Vibhor Kumar,
01:02:05
and the sadly
01:02:07
departed Simon Riggs,
01:02:11
who was an awesome fellow.
01:02:14
I'd like to pay a little tribute
01:02:17
to him as a person,
01:02:20
as a mentor to the
01:02:21
entire Postgres community.
01:02:23
He will be greatly missed.
01:02:28
Thank you very much.
01:02:29
I appreciate you sharing that
01:02:31
because I think it was last week
01:02:34
at the time of recording.
01:02:37
It is a sad story for the Postgres
01:02:41
community as a whole.
01:02:42
Thank you for sharing that.
01:02:46
From your professional life,
01:02:47
for the last couple of years next
01:02:49
to writing books,
01:02:50
I think you're mostly working as a
01:02:51
consultant with a couple of
01:02:53
different companies and customers.
01:02:56
What do you think is
01:02:57
the most common task?
01:02:59
I mean, you're probably coming in
01:03:01
helping them optimize Postgres,
01:03:04
optimize queries.
01:03:05
Right. I've done all sorts of
01:03:07
things in the past few years,
01:03:08
like training customers to use
01:03:12
Postgres in general,
01:03:14
training them to use Postgres in a
01:03:16
specific way that is
01:03:18
suited to their needs.
01:03:21
I have provided support to
01:03:25
customers who ran Postgres,
01:03:27
and also professional services
01:03:28
like consulting.
01:03:31
I can't really say what the thing
01:03:33
they use the most is or they
01:03:36
request the most,
01:03:37
but I can tell you
01:03:40
a few of the things.
01:03:42
Some customers come in and say,
01:03:44
"My queries aren't
01:03:45
running well. What can I do?"
01:03:47
It's like the most
01:03:49
frequent thing you hear.
01:03:52
Some other people say,
01:03:54
"Tell me what hardware
01:03:56
to buy for Postgres."
01:03:57
You tell them, "I can't really
01:04:00
give you a response because it
01:04:02
really depends on your workload,"
01:04:04
which is the most important
01:04:06
factor, I think, with databases.
01:04:08
Everyone uses them differently.
01:04:09
If it's a database that is widely
01:04:12
used as Postgres
01:04:13
with so many use cases
01:04:15
and so many
01:04:16
different ways to use it,
01:04:19
like you can do analytics on it.
01:04:21
To an extent, you can use it for
01:04:23
transaction processing,
01:04:25
(OLTP), you can use it as a
01:04:28
document database,
01:04:32
with JSONB. There's all sorts of
01:04:33
things you can do.
01:04:34
There's no good answer to the
01:04:36
things that people ask like,
01:04:39
"Give me the best tuning
01:04:40
parameters for Postgres,"
01:04:42
or "How to write a
01:04:45
query the right way."
01:04:46
It really depends on the
01:04:48
amount of data you have,
01:04:50
the type of data you have,
01:04:51
and the sort of queries you're
01:04:52
going to be running.
01:04:54
Yeah, that makes a lot of sense.
01:04:56
It's not only for the Postgres
01:04:58
community or for Postgres.
01:05:00
That is very true
01:05:01
for a lot of things.
01:05:04
From my own personal background,
01:05:07
with a lot of programming
01:05:07
languages or runtime environments,
01:05:09
people ask, "What is the
01:05:10
optimized or the optimal way of
01:05:13
configuring it?"
01:05:14
And they're like, "I don't know.
01:05:16
Can't give you the answer."
01:05:18
So, yeah, I hear
01:05:19
where you're coming from.
01:05:22
All right, so...
01:05:24
Sorry, I'm still having
01:05:25
a little bit of a flu.
01:05:28
So, from your personal background,
01:05:33
you said you've
01:05:34
co-written one book,
01:05:35
but I also hinted on the fact that
01:05:38
you're writing
01:05:39
another book right now,
01:05:40
and I looked a little bit into it
01:05:43
because it's on Manning
01:05:44
and it has Early
01:05:45
Access, which is nice.
01:05:46
But maybe you can give us a little
01:05:48
bit of an insight
01:05:49
of what you're writing about.
01:05:51
Right. So, the book
01:05:54
that is under construction
01:05:56
is called PostgreSQL Mistakes
01:05:59
and how you can avoid them.
01:06:02
So, it's a bit of an anti-how-to.
01:06:05
So, for people that
01:06:06
are used to how-to books,
01:06:08
like, "How do I partition? How do
01:06:10
I do this? How do I do that?"
01:06:12
It's a bit of the
01:06:13
other way around.
01:06:14
I was trying to do this, but
01:06:17
things went wrong.
01:06:19
So, it's experiences
01:06:20
that I've collected
01:06:22
from the things I've
01:06:23
seen our customers do
01:06:26
or the things I've
01:06:27
done in the past.
01:06:29
Right.
01:06:30
And it's really
01:06:31
important to learn from mistakes.
01:06:35
Everyone makes mistakes.
01:06:37
And Postgres is very particular in
01:06:39
how it wants things done.
01:06:43
So, if you get it right, the
01:06:46
database is fantastic.
01:06:48
It works very well with
01:06:50
excellent performance.
01:06:51
And when you start to do
01:06:56
things a different way,
01:06:57
you can see different results.
01:07:01
And that's
01:07:01
basically the whole idea.
01:07:03
There's three chapters.
01:07:05
Three chapters up on the web now.
01:07:08
And there's a huge fourth chapter
01:07:10
that's being
01:07:11
published as we speak.
01:07:16
That has anti-patterns that are
01:07:19
not really restricted to Postgres.
01:07:21
It's things like, don't improvise,
01:07:25
don't create your own
01:07:26
distributed systems.
01:07:28
There's people that have spent
01:07:30
hundreds of thousands of hours
01:07:33
working on these problems, and you
01:07:34
don't need to reinvent the wheel.
01:07:38
I hear you.
01:07:38
As you said, there's three
01:07:40
chapters out right now.
01:07:42
I haven't seen the fourth one yet,
01:07:43
so I think I have to look into
01:07:45
that right after the recording.
01:07:47
Manning are in the process of
01:07:49
publishing it as we speak.
01:07:49
All right, cool.
01:07:51
But so far, I really
01:07:53
like the second chapter
01:07:56
and you bringing up all
01:07:57
of the SQL code examples
01:08:00
and showing the execution plans.
01:08:02
And I think just by saying the
01:08:04
word execution plan
01:08:05
or the term execution plan,
01:08:06
I probably lost half of the
01:08:08
audience right now.
01:08:09
So maybe you can give them a
01:08:10
little bit of a feeling of what is
01:08:12
an execution plan?
01:08:13
Why is it so important to
01:08:14
understand those things?
01:08:15
Yeah, so Postgres has a
01:08:17
quasi-intelligent query planner,
01:08:22
which basically examines the way
01:08:26
your query is written
01:08:28
and produces a plan on how it's
01:08:31
going to get executed by the
01:08:33
database server.
01:08:34
It's like, oh, they wrote this,
01:08:37
where, this, and that,
01:08:40
and it looks like a join.
01:08:42
So I'm going to perform
01:08:43
a join of these tables
01:08:45
and then I'm going to order the
01:08:47
results in this.
01:08:48
So that's the execution plan.
01:08:50
It's basically telling you how the
01:08:51
database is going to
01:08:52
execute your SQL query.
01:08:56
Now, the planner takes into
01:08:58
account things such as how much
01:09:00
memory do you have
01:09:01
or how fast are your disks that
01:09:04
you've already specified in the
01:09:06
Postgres configuration.
01:09:07
It also takes into account things
01:09:09
like what's the
01:09:10
nature of the data?
01:09:12
What's the cardinality,
01:09:13
let's say, in your tables?
01:09:16
And these are things that are
01:09:17
updated
01:09:19
automatically by Postgres itself
01:09:21
in its statistics tables.
01:09:23
So it produces, most of the time,
01:09:25
a really good plan.
01:09:28
And what is a good plan?
01:09:29
It's the cheapest plan in
01:09:31
terms of arbitrary cost.
01:09:34
And arbitrary cost is calculated
01:09:37
using those factors
01:09:39
that I just mentioned.
01:09:41
And it iterates through many plans
01:09:45
for the execution,
01:09:47
chooses the cheapest one, which
01:09:48
will probably end up
01:09:49
being the fastest one
01:09:51
to execute in real world terms.
01:09:55
And seeing the execution plans is
01:09:58
key to understand why your queries
01:10:00
are running well
01:10:01
or why they're running slowly.
01:10:04
Because then you can see, ah, this
01:10:06
is what Postgres was trying to do.
01:10:08
So maybe I should force its hand
01:10:10
by writing this
01:10:11
slightly differently.
01:10:14
Yeah, that's true.
01:10:16
I think my personal favorite
01:10:18
example is a
01:10:20
common table expression,
01:10:22
which ends up being a join because
01:10:24
the query planner understands now
01:10:25
a join is actually better.
01:10:26
I don't need to do the temporary
01:10:29
heap table to store the
01:10:32
intermediate result.
01:10:35
Cool, yeah.
01:10:36
So we kind of hinted where people
01:10:38
can find the early access version.
01:10:41
It's at Manning.
01:10:42
Do you want to add
01:10:43
anything more to that?
01:10:46
Maybe have a specific simple URL
01:10:50
or something where
01:10:51
people can find it.
01:10:54
I can share the URL, but I
01:10:55
certainly cannot spell it out.
01:10:58
OK, that's fair enough.
01:10:59
We're going to put
01:10:59
it in the show notes.
01:11:01
That's totally fine.
01:11:02
Thanks very much.
01:11:04
Yeah, I think it's going to be an
01:11:05
interesting book because it's real
01:11:08
world use cases.
01:11:10
And where it isn't a real world
01:11:13
use case, it's close enough.
01:11:16
And I will tell
01:11:17
you so in the text.
01:11:19
That is true.
01:11:20
And I agree.
01:11:20
As I said, I've well kind of read
01:11:23
through the first three.
01:11:25
I read as much as I had time, but
01:11:28
I really enjoyed it.
01:11:30
And many of those code examples
01:11:33
you brought up, as I said,
01:11:34
especially in the second chapter,
01:11:36
they were like, yes, either I've
01:11:38
been there or I had people helping
01:11:40
with that as well.
01:11:42
I've worked for a Postgres based
01:11:45
startup in the past.
01:11:47
And we had people asking pretty
01:11:49
much the same
01:11:50
questions over and over again.
01:11:52
So yes, for everyone using
01:11:54
Postgres or starting using
01:11:55
Postgres, it's probably a pretty,
01:11:57
pretty good pick.
01:11:59
Thank you.
01:11:59
I appreciate that.
01:12:01
Yeah, as you know, people are
01:12:03
familiar with other databases
01:12:05
because Postgres has most recently
01:12:10
exploded in popularity.
01:12:12
It was kind of a niche database
01:12:13
for a few years.
01:12:15
And now it looks like all the
01:12:17
enterprises are using it, all the
01:12:19
hyperscalers are starting to use
01:12:21
it, like AWS, Google, Azure.
01:12:27
This means that they have
01:12:29
recognized the value that Postgres
01:12:32
brings to the table.
01:12:34
Yeah, I agree.
01:12:35
And I think it's kind of
01:12:36
interesting because you kind of
01:12:37
hinted at that earlier.
01:12:39
But you can do a lot
01:12:40
of things with Postgres.
01:12:42
There is a lot of stuff in
01:12:44
Postgres itself.
01:12:45
If you want document database, you
01:12:47
have XML and JSON.
01:12:50
If you want key
01:12:51
value, you have hstore.
01:12:53
But there is also a really good
01:12:56
extensibility to Postgres, giving
01:12:58
you the chance to
01:12:59
plug everything else in,
01:13:00
like time series, graph databases.
01:13:03
I don't know what else.
01:13:04
You probably could define Postgres
01:13:08
as the actual really only in the
01:13:10
world multimodal database.
01:13:13
Right, yeah.
01:13:14
And we were actually considering
01:13:15
of changing the description of
01:13:17
Postgres on the website,
01:13:18
where you go in and it says it's
01:13:20
an object relational database,
01:13:22
which is kind of a formal,
01:13:23
traditional way to put it.
01:13:26
But nowadays, you're right.
01:13:28
I think it's more of
01:13:29
a multimodal database.
01:13:31
And I think that is also the term
01:13:33
that Simon Riggs preferred.
01:13:38
Because it does all of these
01:13:40
things and can also let you do
01:13:42
things that the developers of
01:13:45
Postgres hadn't even thought of,
01:13:47
because of the extension system.
01:13:49
Like a very famous extension is
01:13:51
PostGIS, which is the GIS (geospatial)
01:13:55
capabilities for Postgres.
01:13:59
And is now considered the gold
01:14:01
standard in
01:14:02
geographical databases.
01:14:04
True.
01:14:05
From an open source extension to
01:14:07
an open source database.
01:14:09
And there's like thousands of
01:14:11
people that are professionally
01:14:14
employed to use this extension in
01:14:17
their day jobs, which is amazing.
01:14:20
True.
01:14:21
I agree.
01:14:23
So let me see.
01:14:25
Let me flip back a little bit.
01:14:27
I mean, we're
01:14:27
officially a cloud podcast.
01:14:29
We talked a lot about
01:14:30
the cool Postgres world.
01:14:32
And I was part of
01:14:33
a Postgres world.
01:14:34
I was part of the Java world.
01:14:35
So that is mostly the
01:14:37
guests I had so far.
01:14:39
But because we're a cloud podcast,
01:14:41
what do you think, like working
01:14:43
with all the different customers,
01:14:46
what is your feeling?
01:14:47
Like how many people are actually
01:14:48
deploying Postgres in the cloud,
01:14:51
in Kubernetes, in EC2,
01:14:54
or anything like that?
01:14:58
Well, the company I'm working with
01:15:00
right now are using it on RDS.
01:15:07
They're using RDS Postgres because
01:15:10
it suits their use
01:15:12
case better in the sense
01:15:13
that they don't have a team that
01:15:17
wants to worry about replication
01:15:20
and backups and things like that.
01:15:23
And availability zones, they want
01:15:25
that handled as a service.
01:15:29
And that fits their
01:15:31
use case quite well.
01:15:32
When you want more flexibility,
01:15:34
you can still use the cloud.
01:15:37
You can run, for example, Postgres
01:15:40
on Azure boxes or EC2 boxes or
01:15:43
whatever you want.
01:15:45
But then you have to take care of
01:15:47
these things yourself.
01:15:51
Right.
01:15:52
But it still gives you freedom
01:15:55
from having to worry about hard
01:15:57
drives and hardware and purchase
01:16:01
orders and things like that.
01:16:03
You just send off a check every
01:16:05
month and you're done.
01:16:09
Now, Kubernetes is
01:16:11
an interesting case.
01:16:13
There's a couple of
01:16:14
operators for Postgres.
01:16:16
The most recent one is Cloud
01:16:18
Native PG, which is starting to
01:16:24
get supported and getting traction
01:16:26
from the Cloud Native Computing
01:16:28
Foundation, which is great.
01:16:32
And they are trying to do things
01:16:36
in a different way that
01:16:37
is totally cloud native.
01:16:40
So everything is defined as a
01:16:42
resource in Kubernetes.
01:16:45
But the resources map to things
01:16:48
that are well known in Postgres,
01:16:50
like clusters and nodes and
01:16:52
backups and actual things so that
01:16:56
you don't have to perform black
01:17:00
magic like running it in a pod,
01:17:02
but also having to configure the
01:17:05
pod manually to talk to another
01:17:07
pod that is your
01:17:08
replicant, things like that.
01:17:11
And there are other operators that
01:17:13
have evolved over time to
01:17:15
approximate this ease of use.
01:17:21
I think the Crunchy Data
01:17:23
Operator comes to mind.
01:17:25
It started off
01:17:25
being very imperative.
01:17:31
They had a command line utility
01:17:33
that created clusters and so on.
01:17:35
And now they've turned it into a
01:17:36
declarative, which is more cloud
01:17:39
native, more preferred by the
01:17:41
Kubernetes world.
01:17:44
I think these two are the major
01:17:46
Postgres things that I've seen in
01:17:50
Kubernetes, at least that I've
01:17:54
seen in use the past few years.
01:17:58
There are still things that
01:18:00
haven't been sorted because, as we
01:18:02
said, Postgres is super flexible.
01:18:05
And this flexibility and the ease
01:18:08
of use of Kubernetes, where
01:18:11
everything is taken care of
01:18:12
automatically, comes at a cost.
01:18:14
You have reduced flexibility when
01:18:15
you're on Kubernetes.
01:18:19
So there's things that haven't
01:18:21
been totally worked out yet, like
01:18:23
how do you one click migrate from
01:18:26
a cluster that is outside
01:18:28
Kubernetes to something that is
01:18:30
running in Kubernetes?
01:18:32
Or can you take a backup that was
01:18:34
produced elsewhere and create a
01:18:35
cluster in Kubernetes, a Postgres
01:18:37
cluster from that backup?
01:18:40
Now, once they have these things
01:18:41
sorted and also hardware support
01:18:47
is very important when you're
01:18:50
talking to databases,
01:18:51
I think we'll see many more people
01:18:52
going to Postgres on
01:18:55
Kubernetes in production.
01:18:57
But specifically hardware and
01:19:01
specifically disk performance and
01:19:04
throughput and latency,
01:19:08
you have to get into the hardware
01:19:09
nitty gritty of Kubernetes to take
01:19:13
maximum advantage of Postgres
01:19:14
because as a
01:19:16
database, it loves fast disks.
01:19:19
Generally speaking, the faster
01:19:21
your disk, the
01:19:22
faster Postgres will go.
01:19:25
That is true. And just like a
01:19:28
shameless plug, we're
01:19:29
working on something.
01:19:32
But because we're running out of
01:19:34
time already, 20 minutes
01:19:35
is always so super short.
01:19:38
What do you think is going to be the next
01:19:40
thing for Postgres, the database
01:19:42
world, the cloud
01:19:43
world, whatever you like.
01:19:45
What do you think
01:19:45
is the next thing?
01:19:47
I can't give you an answer, but
01:19:48
you can go search on YouTube and
01:19:53
you can find the lovely Simon's
01:19:56
last contribution to
01:19:58
the Postgres community.
01:20:00
He gave a talk at PostgreSQL
01:20:02
Conference Europe last December
01:20:04
where he said "Postgres, the next 20
01:20:08
years" or something to that effect.
01:20:10
And he predicted things, how
01:20:13
things will go for Postgres in the
01:20:15
future and future directions.
01:20:17
That's a very interesting talk for
01:20:19
anyone who wants to watch that.
01:20:21
I wouldn't want to hazard a guess
01:20:23
because I've seen people just
01:20:27
blindly accept the thing that AI
01:20:30
is the next big thing.
01:20:32
And everything in Postgres and
01:20:33
databases and Java and Python is
01:20:36
going to revolve
01:20:36
around AI in the future.
01:20:39
That remains to be seen.
01:20:41
I like that because normally I
01:20:43
start to say, please don't say AI.
01:20:46
Everyone says that.
01:20:48
And I think AI will be a big part
01:20:52
of the future, but
01:20:54
I agree with you.
01:20:55
It remains to be seen how exactly.
01:20:59
Yeah, thank you very much.
01:21:00
We're going to put the video link
01:21:02
in the show notes as well for
01:21:04
everyone interested.
01:21:06
And yeah, Jimmy,
01:21:09
thank you very much.
01:21:10
It was a pleasure having you.
01:21:12
Thanks very much. I
01:21:13
appreciate the invitation.
01:21:15
My pleasure.
01:21:16
And for the audience, we're going
01:21:19
to see or hear us next week.
01:21:22
And thank you very
01:21:23
much for being here.
01:21:25
The cloud commute podcast is sponsored by
01:21:27
simplyblock your own elastic
01:21:28
block storage engine for the cloud.
01:21:30
Get higher IOPS and low predictable
01:21:32
latency while bringing down your
01:21:33
total cost of ownership.
01:21:35
www.simplyblock.io

