Streaming Databases are not the same as a Streaming Processor
Cloud Commute, October 18, 2024
34
00:35:35, 32.59 MB

In this episode of Simplyblock's Cloud Commute Podcast, Yingjun Wu, founder of RisingWave Labs, discusses stream processing and how RisingWave enables real-time data analytics. He explains the differences between batch and streaming databases, RisingWave’s PostgreSQL compatibility, and its open-source focus.

In this episode of Cloud Commute, Chris and Yingjun discuss:

  • What is stream processing and how does it differ from batch processing?
  • Why RisingWave is a better fit for real-time data analytics.
  • How PostgreSQL compatibility enhances RisingWave’s functionality.
  • What open standards like Postgres, Kafka, and Iceberg mean for the future of data infrastructure.

Interested in learning more about the cloud infrastructure stack, such as storage, security, and Kubernetes? Head to our website (https://podcasts.simplyblock.io/podcast/cloud-commute) for more episodes, and follow us on LinkedIn (www.linkedin.com/company/simplyblock-io/mycompany/). You can also check out the detailed show notes on YouTube (https://youtu.be/tJIoKWqInu4).

You can find Yingjun Wu (Founder & CEO at RisingWave Labs) on LinkedIn: https://www.linkedin.com/in/yingjun-wu/

About simplyblock:

Simplyblock is an intelligent database storage orchestrator for IO-intensive workloads in Kubernetes, including databases and analytics solutions. It uses smart NVMe caching to reduce read I/O latency and speed up queries. A single system connects local NVMe disks, GP3 volumes, and S3, making it easier to handle storage capacity and performance. With the benefits of thin provisioning, storage tiering, and volume pooling, your database workloads get better performance at lower cost without changes to existing AWS infrastructure.

👉 Get started with simplyblock: https://www.simplyblock.io/buy-now

🏪 simplyblock AWS Marketplace: https://aws.amazon.com/marketplace/seller-profile?id=seller-fzdtuccq3edzm

Chris Engelbert: I think that is one of the big differences to RisingWave, where you really just feed in data and you get this live updating of the actual data table, basically.

Yingjun Wu: I think the simplest way to explain a streaming database is that we can just consider it a standard database system, but tailored for stream processing. It's static. It's not dynamic. If you change the base table, if you insert new tuples into the base table, the materialized view won't change. And that's the Postgres materialized view. Don't just focus on AI. Think about data infra. Because I talk to a lot of people, and people say, okay, why are you still working on data infra, right? Data infra is dead. Look at the Snowflake stock, and it's dropping. It's dropping every single day.

Chris Engelbert: Hello everyone. Welcome back to this week's episode of Simplyblock's Cloud Commute Podcast. This week, again, an incredible guest. You know the spiel: I say this every time and it's true, so I have to repeat it. All right. So with me today I actually have... oh, I didn't ask how to pronounce your name specifically, but I think it's Yingjun, from RisingWave. Probably close enough, or as close as I can get.

Yingjun Wu: Let me do the intro.

Chris Engelbert: Okay, yeah. Just go ahead and introduce yourself.

Yingjun Wu: Yeah. Every time I do an interview, people ask me how to pronounce my name. Anyway, if you're not familiar with how to pronounce Asian names: my name is Yingjun Wu, and I'm the founder of RisingWave Labs. RisingWave is a stream processing company. And a lot of people ask me, okay, why are you working on stream processing? Right.
It's not like Postgres, and it's not like Redshift or Snowflake, this kind of thing. So, a little bit about my background. I did my PhD in stream processing and database systems almost 10 years ago. At that time, there were no stream processing systems called Spark Streaming, Flink, or Samza. There were just single-node systems, such as Microsoft StreamInsight, and some research projects. After I obtained my PhD, I joined the IBM Almaden Research Center in San Jose, doing research in stream processing and transaction processing, basically transactional databases. What I was working on was a project called IBM Db2 Event Store. As the name says, it's basically a database for storing event data. Afterwards I joined AWS Redshift, which is a data warehouse for storing data. I was working on some very cool things, such as vectorized scans, vectorized execution, decoupled compute and storage architecture, and materialized views. I found that Redshift is great technology, but it's for batch data. What I could see is that more and more people are using streaming data, and they really want stream processing capability so that they can gain insights directly from the streaming data. That's why I started the company. I called it RisingWave because it's a wave, right? It's water and a stream.
So I started a company, and I'm working on open source technology because I really believe in open source.

Chris Engelbert: Oh, now the name makes sense. I always wondered how you came up with RisingWave. The second it clicks, it's like, okay, yep. So maybe give us a quick introduction to RisingWave. You already said it's a stream processing engine. What are the typical use cases that you see?

Yingjun Wu: Yeah. RisingWave officially is a streaming database, but people always wonder, okay, what is a streaming database? People have probably heard of operational databases or OLTP databases, or analytical databases or OLAP databases. But what is a streaming database? I think a simple way to explain a streaming database is that we can just consider it a standard database system tailored for stream processing. The other way to think about it is that we can process streaming data in a Postgres way, or in a database way. You can just create a table, create a materialized view, and run queries there. That's a streaming database. As for the use cases, in most cases it's about real-time processing, real-time data transformation, and real-time data analytics. I can give a couple of examples. Let's say we want to do stock trading and we really want to know what happened over the last five minutes, or we want continuous monitoring of what happened over the last five or 10 minutes. In this case, it's not convenient to use OLAP databases or data warehouses to do such things, because those systems are basically batch-based systems. They are highly optimized for processing large-scale data, probably a day's data, right?
For yesterday's data. But if you want to use such a database to continuously process what happened over the last five minutes, or even the last 10 seconds, it is actually not a very good fit. That's where RisingWave can be a pretty good fit.

Chris Engelbert: And in this case, I think people would normally, when we look at Postgres, for example, go for a materialized view. But the problem is that a materialized view is an all-or-nothing refresh, which means you have to recalculate everything. And I think that is one of the big differences to RisingWave, where you really just feed in data and you get this live updating of the actual data table, basically.

Yingjun Wu: Whenever people talk about materialized views, specifically in Postgres... I saw a thread on Reddit a few days ago about when you need Postgres materialized views. A comment said that if you want speedy results instead of timely results, you should use materialized views. Because in Postgres, whenever you create a materialized view, you get the results, and the results will not be automatically updated. You actually have to issue a command called REFRESH MATERIALIZED VIEW, and Postgres will recompute the materialized view for you. The easiest way to think about a PostgreSQL materialized view is that you just store the computation results in a table. It's static. It's not dynamic. If you change the base table, if you insert new tuples into the base table, the materialized view won't change. And that's the Postgres materialized view.
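
Editor's note: the contrast discussed so far, a view that stays static until it is explicitly refreshed versus a view that updates as data is fed in, can be sketched in a few lines of plain Python. This is a toy model and not PostgreSQL or RisingWave code; the class names `SnapshotView` and `IncrementalView`, the per-symbol sum, and the sample rows are all assumptions made up for illustration.

```python
from collections import defaultdict


class SnapshotView:
    """Hypothetical snapshot-style view: results change only on refresh()."""

    def __init__(self, base_table):
        self.base_table = base_table
        self.result = {}
        self.refresh()

    def refresh(self):
        # Full recomputation: every row of the base table is scanned again.
        totals = defaultdict(int)
        for symbol, qty in self.base_table:
            totals[symbol] += qty
        self.result = dict(totals)


class IncrementalView:
    """Hypothetical incrementally maintained view: each insert updates it."""

    def __init__(self):
        self.result = {}

    def on_insert(self, symbol, qty):
        # Only the affected group is touched; older rows are not rescanned.
        self.result[symbol] = self.result.get(symbol, 0) + qty


# Made-up base table of (symbol, quantity) rows.
base_table = [("AAPL", 10), ("MSFT", 5)]
snapshot = SnapshotView(base_table)
incremental = IncrementalView()
for row in base_table:
    incremental.on_insert(*row)

# A new tuple arrives in the base table.
base_table.append(("AAPL", 7))
incremental.on_insert("AAPL", 7)

stale = dict(snapshot.result)  # the snapshot has not seen the new tuple
snapshot.refresh()             # explicit, full recomputation catches up
```

After the new tuple arrives, `stale` still holds the old totals while `incremental.result` already includes it; the snapshot only agrees again after `refresh()`, which rescans the whole table.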
But in our case, with RisingWave's materialized views, it's different. Every time new data comes in, whether it's coming from Kafka or you just insert a tuple into your Postgres or whatever your upstream system is, we capture the changes and refresh the materialized views for you, so that you can always see fresh and consistent results. That's what we do.

Chris Engelbert: The way I loved to talk about Postgres materialized views in the past: it's like you have your own personal secretary, and she's shredding the report from the previous day, all of it, before she starts in the morning to redo it. That's about what it is. You basically throw everything away and start again from the bottom. As you know, I've been with Timescale before, and that was exactly one of the reasons why we also said, okay, we need a better solution. With time series, with terabytes of data, you cannot do this. And personally, I think, and we talked about that, RisingWave would still have been a step up from the continuous aggregates. I actually played around with it, and it was a great experience. We talk a lot about stream processing, and you kind of hinted at that. Maybe give me, or the audience, a one-minute explanation of stream processing specifically. What do you expect from that?

Yingjun Wu: Yeah, actually a lot of people are not quite familiar with stream processing. But consider, let's say, Snowflake or Redshift. These kinds of systems are batch systems. You probably just upload data from S3 into, let's say, Redshift, and then you run a query to generate your report. Such a query runs on top of your batch data. You have probably the last day's data, or the last seven days' data, or the last year's data. That's batch data, right?
And then you run a query on top of this data and generate a report: what happened over the last seven days? What happened over the last year? That's batch processing. But stream processing is totally different. It's focused on fresh data and fresh results, which means it continuously ingests data from upstream services. Every time a new tuple, a new event, comes in, it triggers the computation and refreshes your computation results, so that you can always see the latest results and gain insights from your data streams. People may ask, okay, when do I need to do stream processing? Think about it. I just mentioned stock trading. Or say you want to do something like IoT monitoring: you have an energy plant and you really want to monitor what happens, continuously monitoring the voltage and all these kinds of metrics. So you definitely want to do stream processing. Or let's say you run a payment service, or you're building a billing service. You really want to detect fraud in real time. You should not say, okay, I will do the billing every single month, and then find out that there was a lot of fraud. That won't work. You should detect fraud in real time so as to avoid any kind of loss. So that's stream processing.

Chris Engelbert: Right. And I think there are not too many actual real-time processing engines. Real time not in the hard, hardware real-time way, but soft real time. A lot of these engines that look real time still do batching. They call it micro-batches, which is like a couple of milliseconds, but it's still not real time, right?
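
Editor's note: a rough way to picture the difference between event-at-a-time processing and micro-batching is to compare when an alert becomes visible. The Python sketch below is a simplified latency model, not how any particular engine is implemented; the function names, the 50 ms batch interval, the fraud threshold, and the sample payments are all made up for illustration, and processing time is ignored.

```python
def per_event(events, threshold):
    """Handle each event on arrival; return (event_time, alert_time) pairs."""
    alerts = []
    for ts, amount in events:
        if amount > threshold:
            alerts.append((ts, ts))  # alert is raised at arrival time
    return alerts


def micro_batch(events, threshold, batch_ms):
    """Buffer events into fixed windows; alerts appear when a window closes."""
    alerts = []
    for ts, amount in events:
        if amount > threshold:
            # The event waits until the end of the window it falls into.
            window_end = (ts // batch_ms + 1) * batch_ms
            alerts.append((ts, window_end))
    return alerts


# (timestamp in ms, payment amount): made-up sample data
events = [(3, 20), (12, 950), (47, 15), (51, 1200)]

streaming = per_event(events, threshold=500)
batched = micro_batch(events, threshold=500, batch_ms=50)

# Extra waiting time introduced by batching, per alert, in milliseconds.
added_latency = [alert_time - ts for ts, alert_time in batched]
```

In this model both approaches flag the same two payments; the micro-batch variant simply reports them later, by up to one batch interval.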
I mean, that's the main difference. And it's one of the differences to something like Hazelcast Jet, which also does micro-batches but looks basically real time because the batches are so small.

Yingjun Wu: Yeah, there are definitely a couple of systems doing micro-batching, and I think one of the most famous ones is Spark Streaming. Everyone knows that Spark is definitely a batch engine. But people definitely have the demand to do continuous computation, and they say, I don't really want to write a new engine, so why not just reuse Spark? Reuse what we have. That's why they have Spark Streaming. Basically, it chops up your data streams into micro-batches, does the computation in every single batch, and then puts the results together. That's what they do. It's definitely great technology, especially if you come from Spark, where it's quite straightforward. But if you really want a true real-time, event-driven architecture, where you receive events from your upstream services, let's say GitHub events or Twitter events, and you want to do something like monitoring, then the micro-batch way may not work, and you probably need true stream processing.

Chris Engelbert: So, as you said before, if I want to ingest data into RisingWave, how would I do this?

Yingjun Wu: Okay. So in our case, we definitely support a lot of ways to ingest data.
The easiest way is... I probably didn't mention this before, but RisingWave is Postgres compatible. I actually spent a lot of time working on Postgres in the past, especially during my PhD. RisingWave is not based on Postgres, but it is Postgres wire compatible. The behaviors are, I cannot say exactly the same, but very similar to Postgres, and you can run SQL queries in the same way as you use Postgres. So in terms of ingestion, you can definitely just create a table and insert tuples into it. That's the easiest way. But let's say you have a Kafka service, an event streaming service. Then the only thing you need to do to ingest the data from Kafka is, in our case, to create a source from Kafka, and RisingWave will continuously ingest the data from Kafka. Or let's say you don't really have Kafka, you just have Postgres, MongoDB, MySQL, or even Oracle or SQL Server. You have those kinds of services and you want to ingest the data, because there are data changes in the upstream services and you want to capture these changes. The only thing you need to do in RisingWave is to create a source from, let's say, Postgres or MongoDB. And although we are a stream processing system, we also support ingesting data from batch sources, let's say S3 or Iceberg. Why do we do that? Because we find that many people want to join the streaming data with batch data.
Let's say I have a click stream and also user info, in a user info table stored in S3. I really want to join the events in the click stream with the table in order to enrich the stream before delivering the result into my data warehouse or data lake. That's what we do. So, to summarize, for ingestion you can use INSERT INTO statements, or you can use a statement called CREATE SOURCE to ingest data from your middleware, whether it's Kafka, Postgres, or S3.

Chris Engelbert: I think the easiest way, and you mentioned that, is basically creating a source and using Kafka and Debezium to capture the change events from the database and just drop them in right away.

Yingjun Wu: That's right. Essentially we use Debezium internally, so you do not need to have Kafka plus Debezium. If you have Postgres, you can just use RisingWave, and we run Debezium for you.

Chris Engelbert: Okay. So RisingWave is implemented in Java?

Yingjun Wu: No, it is in Rust. A fun fact: I'm the founder of RisingWave, but I do not really know Rust. I do not really code in Rust. I'm actually a C++ programmer. I started in 2021 and wrote the first 10 or 20 thousand lines of code, or probably the first hundred thousand lines of code, in C++. But our employees told me that C++ is outdated, which I don't really agree with. We did find, though, that there's a huge problem with C++ when you're in a small team. The problem is debugging.
It's super painful. I started working with C++ probably 10 or 15 years ago, and since then it keeps evolving. We had C++, then C++11, and nowadays probably 20-something, C++23, right? It's kind of crazy, it's always evolving, and it's really hard for people to unify their styles. Some people use malloc and free, some people use new and delete, and some people use unique pointers and shared pointers. And I know there are some more advanced semantics nowadays. So it's very, very complicated. If you run into a bug, say a segmentation fault, you may have to spend a few days debugging. When I was at Redshift, Redshift was coded in C++. I mean, Redshift was based on Postgres, and that was C code. Since then, we added C++ code, in different versions of C++. And I still remember that for one single bug, about UDFs in Redshift, one person spent two weeks just debugging. That was a terrible experience. I never want to do it again. And we hit a very similar problem building RisingWave in C++. I talked to many people in big companies about this, and they told me that you should probably trust your employees. I definitely trust them, okay?

Chris Engelbert: I think that is very, very good advice for many people. Trust your employees. You hire them for a reason.

Yingjun Wu: No, I mean, I really trust them. All right. But the thing here is, I don't really want to be the only one that uses Rust, right?
But definitely, I think the Rust ecosystem is great. So many systems use it. I have to say that's probably also thanks to crypto. A lot of crypto companies actually use Rust. And then I heard the news that Microsoft is trying to introduce Rust into their environments, and then Facebook, I mean Meta, and even Amazon also said, okay, we will probably try Rust. So we decided to basically delete the entire code base and rewrite it in Rust. There was a pretty famous blog post, which probably got 1 million views, talking about how we migrated from C++ to Rust. It was a pretty fun experience.

Chris Engelbert: Yeah, I think I know that blog post. But Debezium is implemented in Java. That's why I asked.

Yingjun Wu: It is in Java.

Chris Engelbert: So there is a little bit of Java code to do the integration with Debezium.

Yingjun Wu: Yeah, definitely. We do not really talk shit about Java, you know?

Chris Engelbert: No, no, it's fine.

Yingjun Wu: Personally, I know that people always say that Java is slow and things like that. But to be honest, I agree that Rust is a more modern language. Forget about performance, forget about anything else, it's a more modern language. But I have to say, if you want a better experience or integration with the big data ecosystem, you probably still have to write some Java code. If you want integration with, let's say, Kafka, or for sure Debezium, right?
Or integration with some other systems, probably Spark. So you actually have to have Java code. It's kind of inevitable, but we do see that there are probably some performance issues. For example, we support sinking results from RisingWave into the Iceberg format. Iceberg was in Java, and we found that sinking data from RisingWave to Iceberg was super slow. That's why we collaborated with some other companies and created a project called Iceberg Rust, Iceberg RS. You can check it out for yourself; it's an Apache project at the Apache foundation. It's basically the Rust version of Iceberg.

Chris Engelbert: I love Java. Everything has a J-something. In Rust, everything is RS.

Yingjun Wu: Yeah, that's right. There's another project called delta-rs, which is basically the Rust version of Delta Lake. I was actually talking to a guy who works on the delta-rs project. I know that people are moving to Rust, but we have to admit that a lot of projects are done in Java, and we have to be open-minded and have integrations with all these kinds of systems. We should not say, okay, move everything into Rust. I don't really think that's going to happen within the next five years.

Chris Engelbert: Right.

Yingjun Wu: But anyway, yeah.

Chris Engelbert: All right. Let's keep going with the topic, because we kind of drifted off. We're a cloud podcast, so tell me about RisingWave Cloud.

Yingjun Wu: RisingWave Cloud is probably the easiest way to run RisingWave. But as a founder... if you talk to our salespeople, they will always say, great, sign up on our cloud platform, and you can get it for free, forever free or whatever, right?
But my suggestion is always: you can actually try it out on your local machine, on your laptop. You probably have a Mac, so try it out there. Don't just sign up.

Chris Engelbert: So that's the difference between salespeople and engineers. I'm the same.

Yingjun Wu: Yeah, I always test out all kinds of stuff on my own laptop. That's the easiest way for me, at least. My sales guys will probably hate me, but I have to say that's the easiest way. Anyway, I have to say that RisingWave Cloud is also a pretty easy way, one of the easiest ways. Officially, it's still the easiest way to run RisingWave. We run everything in Kubernetes, on AWS, GCP, and Azure. Let's talk about AWS: it's always in Kubernetes, and we use S3 as the storage for persisting all the data and run the computation in EC2. We also use EBS for caching purposes.

Chris Engelbert: Do you use hosted Kubernetes services like EKS?

Yingjun Wu: Yeah, we do use EKS.

Chris Engelbert: All right. I still see a lot of people that say, yeah, we use Kubernetes, but we use EC2 machines and set up our own clusters.

Yingjun Wu: For us, it's more like a liability issue. We don't really want to manage too many things. But I do see that many companies, especially enterprise companies, run everything on their own. So we do provide a Kubernetes operator and also Helm charts, so that they can deploy it in their own environment.

Chris Engelbert: It's like you've read my next question. Okay, so there is a Kubernetes operator.
That means if I want to go into production after I tried it on my own machine, or I used RisingWave Cloud and feel like I don't want to use RisingWave Cloud, I want to have it running in my own system, my own environment, for which there are good reasons, right? Compliance, security, all that kind of stuff. So I would use, you said, the Helm chart or the operator. But I think the operator is the better version, and it probably takes a lot of the actual operational overhead away from you, right?

Yingjun Wu: Yeah, sure. By the way, I have to say it totally depends on personal preference and their environment as well. But I do see that a lot of people are using operators, yes.

Chris Engelbert: Okay. So it sounds like you're not much of a fan of operators.

Yingjun Wu: Well, I'm a fan of operators, for sure.

Chris Engelbert: Okay, fair enough. From my perspective, Helm charts are great for everything you just install but don't really have to operate. For example, with Postgres, I would never recommend anyone deploy Postgres in Kubernetes without using an operator, because it's not just the database. There's so much stuff that needs to be done around it.

Yingjun Wu: Yeah. Actually, I don't really recommend self-hosting Postgres, because there are so many options, right, like RDS.

Chris Engelbert: We agree to disagree. Very good. All right. For the sake of time, because we crossed the 25 minutes about two minutes ago: what do you think is the next big thing we see on the horizon? Could be anything.

Yingjun Wu: I have to say, I bet on three things: Postgres, Kafka, and Iceberg. I do not really put my bet on a certain technology.
I do not really put my bet on, let's say, Confluent. But I do put my bet on open standards: Postgres, Kafka, and Iceberg. So what is Postgres? Postgres is essentially the operational database that hosts a team's or company's operational data. Kafka hosts a company's streaming data. And Iceberg should be, and will be, the place to host the historical data. That's why I believe that if you want to build a new system in 2025, next year, or you want to start now, you should definitely bet on these three open standards. I'm not talking about a particular technology, because I know there are all kinds of versions of Kafka and all kinds of versions of Postgres. But I have to say that we have to bet on the open standards. I also bet on better integration and a better ecosystem. And I do believe that a cheaper, more cost-efficient version of these standards, Postgres, Kafka, and Iceberg, would be the next big thing. People will probably also talk about AI. AI is a great technology and it's a big thing. All right.

Chris Engelbert: Fair enough. I was just about to say, it's so uncommon that people do not put AI front and center that it is very refreshing every time it happens.

Yingjun Wu: Well, it's a podcast where we don't put AI in the center, you know.

Chris Engelbert: That is true. I would basically say, if you start building a new technology or a new product next year, we have hopefully figured out that not everything needs an AI agent. So let's see. All right, last question. Is there anything else you want the audience to know?

Yingjun Wu: Okay.
So don't just focus on AI. Think about data infra. I talk to a lot of people, and people say, okay, why are you still working on data infra? Data infra is dead, things like that. Look at the Snowflake stock: it's dropping every single day. But don't think about that. That's Snowflake's business; it has nothing to do with our business or some other company's business. I do believe there's a future for data infra. It's still evolving. Look at the modern data infra. If you look at Snowflake, Databricks, or Redshift, these kinds of systems are built for enterprises, but I do believe that developers and SMBs are underserved, and they deserve better data infra. Let's say I want to run a startup today, or tomorrow, or next year. Should I start with Snowflake? No, no. Snowflake is too expensive. I should not start with that. I should probably start with Postgres or some simpler things. And for Postgres, I know there are a bunch of vendors nowadays that essentially sell Postgres: Timescale, Neon, Supabase, all these kinds of vendors. They're great. But it's more than just Postgres. We have to reinvent everything: reinvent Kafka, reinvent the data warehouse, reinvent the data lake, reinvent everything in this data stack. So I do believe there are a lot of things to be done in the data infra space. All right.

Chris Engelbert: Fair enough. You're certainly an engineer and not a salesperson. That question normally goes to:
"You want to sign up for the cloud? You want to try it out if you have a use case?" I tried RisingWave. I told him before the actual recording, I loved it. Really, try it out. It's really cool technology and it has a lot of use cases.

Yingjun Wu: Yeah. I mean, if you don't really want to talk to us, I mean, talk to me, right? I can talk about the technology, about how to use RisingWave, and about whether your use case is the best fit. Some people told me that they want to store time series data, or store some operational data, or do some random stuff. I will always give my unbiased opinion. I will not tell you that you can use RisingWave to do everything.

Chris Engelbert: Yeah, I like that, because I'm very much the same. By the way, there is a great Slack community. Join that if you have any questions about RisingWave. And I think you're in the Slack as well, so you're always up for it.

Yingjun Wu: For sure. Just DM me and I will definitely get back to you.

Chris Engelbert: Perfect. So thank you very much for being here. We're way above time, but it doesn't matter. It is what it is. That's how conversational podcasts work. Thank you for being an amazing guest. And for the audience, you know the spiel: same place, same time next week. And let's see who will be up next.