WEBVTT 1 00:00:00.800 --> 00:00:04.600 Ah, well, the traditional problems that you have, I think, 2 00:00:04.600 --> 00:00:07.920 with every piece of software that you're trying to 3 00:00:07.920 --> 00:00:10.640 productionize: it's fairly easy to build something 4 00:00:12.080 --> 00:00:15.840 that works, that solves a problem. But as soon as you 5 00:00:15.919 --> 00:00:19.320 put it out into the world, you have a few things. First of all, you 6 00:00:19.320 --> 00:00:23.080 need to make sure it's secure. Second of all, you need to make sure that 7 00:00:23.080 --> 00:00:26.880 it scales. Because if, and I certainly hope so for you 8 00:00:26.880 --> 00:00:30.560 as a startup, it actually works and people enjoy using it, 9 00:00:30.640 --> 00:00:34.400 the next step really is how does it scale? How does it not fall 10 00:00:34.400 --> 00:00:38.200 apart under the load of people wanting to 11 00:00:38.200 --> 00:00:41.880 try it? And the third thing 12 00:00:41.880 --> 00:00:45.680 really is observability, being able 13 00:00:45.760 --> 00:00:49.360 to really get telemetry, look 14 00:00:49.360 --> 00:00:52.250 into what's actually happening. 15 00:01:02.090 --> 00:01:05.730 YouTube blog covering the German startup scene with 16 00:01:05.730 --> 00:01:09.210 news, interviews and live events. 17 00:01:11.690 --> 00:01:15.370 AWS is proud to sponsor this week's episode of 18 00:01:15.370 --> 00:01:18.970 Startuprad.io. The AWS startup team comprises 19 00:01:19.360 --> 00:01:22.640 former founders and CTOs, venture capitalists, 20 00:01:22.880 --> 00:01:26.720 angel investors and mentors ready to help you prove 21 00:01:26.720 --> 00:01:30.240 what's possible. Since 2013, AWS 22 00:01:30.480 --> 00:01:34.200 has supported over 280,000 startups across the 23 00:01:34.200 --> 00:01:36.800 globe and provided US$7 billion 24 00:01:38.120 --> 00:01:41.920 in credits through the AWS Activate 25 00:01:41.920 --> 00:01:45.440 Program.
Big ideas feel at home at 26 00:01:45.440 --> 00:01:49.000 AWS. With access to cutting-edge technologies 27 00:01:49.160 --> 00:01:52.920 like generative AI, you can quickly turn those ideas into 28 00:01:52.920 --> 00:01:56.520 marketable products. Want your own AI-powered 29 00:01:56.520 --> 00:02:00.200 assistant? Try Amazon Q. Want to 30 00:02:00.200 --> 00:02:03.480 build your own AI products privately? 31 00:02:03.480 --> 00:02:07.160 Customize leading foundation models on Amazon Bedrock. 32 00:02:07.240 --> 00:02:10.840 Want to reduce the cost of AI workloads? 33 00:02:11.000 --> 00:02:14.770 AWS Trainium is the silicon you're looking for. 34 00:02:15.330 --> 00:02:19.090 Whatever your ambitions, you've already had the idea. 35 00:02:19.570 --> 00:02:22.690 Now prove it's possible on AWS. Visit 36 00:02:23.170 --> 00:02:25.730 aws.amazon.com/startups 37 00:02:26.450 --> 00:02:29.970 to get started. So you built a chatbot. 38 00:02:29.970 --> 00:02:33.650 Cool. But now your team is stuck wondering how to connect it 39 00:02:33.650 --> 00:02:37.250 to real APIs, make it reliable or roll it out 40 00:02:37.250 --> 00:02:40.930 to a thousand users. That's where this episode comes in. 41 00:02:41.010 --> 00:02:44.850 AWS's Dennis Traub walks us through how 42 00:02:44.850 --> 00:02:48.370 to productionize AI securely, at 43 00:02:48.370 --> 00:02:52.090 scale and without breaking your app. We 44 00:02:52.090 --> 00:02:55.770 dive into agentic workflows, MCP 45 00:02:55.930 --> 00:02:59.690 and how real startups go from MVP to market. 46 00:03:00.170 --> 00:03:03.770 Let's see in our episode today, in cooperation with AWS. 47 00:03:04.170 --> 00:03:07.850 Dennis Traub is a developer advocate at AWS, focused 48 00:03:07.930 --> 00:03:11.660 on helping startups and enterprises take their 49 00:03:11.660 --> 00:03:15.260 GenAI experiments into real-world deployment.
With 50 00:03:15.260 --> 00:03:18.620 decades of experience in secure infrastructure and 51 00:03:18.620 --> 00:03:22.380 developer productivity, Dennis has helped teams across 52 00:03:22.460 --> 00:03:25.820 industries automate, integrate and scale their 53 00:03:26.060 --> 00:03:29.900 projects using modern cloud-native tools. 54 00:03:30.220 --> 00:03:33.700 Today we talk about Model Context Protocol, agent-based 55 00:03:33.700 --> 00:03:37.300 architecture and what it really takes 56 00:03:37.300 --> 00:03:40.980 to make AI work in production. Dennis, welcome back 57 00:03:40.980 --> 00:03:44.580 to Startuprad.io. Oh, thanks, 58 00:03:44.660 --> 00:03:48.460 thanks for having me again. Totally my pleasure. For everybody who's 59 00:03:48.460 --> 00:03:52.139 not aware of this, this is number two of a series of two 60 00:03:52.139 --> 00:03:55.900 interviews. But actually one of your colleagues will join us as well. 61 00:03:55.900 --> 00:03:59.740 So in total we have four interviews with you guys 62 00:03:59.740 --> 00:04:01.700 here. Dennis, 63 00:04:02.670 --> 00:04:06.470 productionizing AI. Tell us about it. 64 00:04:06.470 --> 00:04:09.790 What's the hard truth? What's the biggest gap you see between 65 00:04:09.870 --> 00:04:12.910 GenAI prototypes and actual production systems? 66 00:04:15.310 --> 00:04:18.750 Well, the traditional problems that you have, I think, with 67 00:04:18.910 --> 00:04:22.510 every piece of software that you're trying to productionize: 68 00:04:22.750 --> 00:04:26.190 it's fairly easy to build something that 69 00:04:26.190 --> 00:04:29.930 works, that solves a problem. But as soon as you put 70 00:04:29.930 --> 00:04:33.210 it out into the world, you have a few things. First of all, you need 71 00:04:33.210 --> 00:04:37.050 to make sure it's secure. Second of all, you need to make sure that it 72 00:04:37.050 --> 00:04:40.890 scales.
Because if, and I certainly hope so for you as 73 00:04:40.890 --> 00:04:44.730 a startup, it actually works and people enjoy using it, the 74 00:04:44.730 --> 00:04:48.130 next step really is how does it scale? How does it not fall 75 00:04:48.130 --> 00:04:51.770 apart under the load of people wanting 76 00:04:51.770 --> 00:04:55.350 to try it? And the third 77 00:04:55.350 --> 00:04:59.190 thing really is observability, being 78 00:04:59.190 --> 00:05:02.670 able to really get telemetry, 79 00:05:02.910 --> 00:05:06.750 look into what's actually happening. Does it do what it's supposed to 80 00:05:06.750 --> 00:05:10.350 be doing? Am I running risks in 81 00:05:10.350 --> 00:05:13.950 terms of, for instance, running up a large bill 82 00:05:14.110 --> 00:05:17.710 because it does something that I didn't expect it to do? 83 00:05:19.550 --> 00:05:22.750 You may run into edge cases that you didn't have in your 84 00:05:22.830 --> 00:05:25.710 prototyping environment. So these are the traditional problems 85 00:05:26.710 --> 00:05:30.470 that you usually have when you've built something 86 00:05:30.470 --> 00:05:33.670 small and you put it into the market. The 87 00:05:34.550 --> 00:05:37.590 second thing that very often happens as well is 88 00:05:38.550 --> 00:05:42.390 that most pieces of software are not an island. 89 00:05:42.470 --> 00:05:46.190 They have to connect to something. They have to connect to third-party APIs or 90 00:05:46.190 --> 00:05:49.870 to your own APIs, or to customers' internal APIs 91 00:05:49.870 --> 00:05:53.720 and data sources. And that can be hard as well because 92 00:05:53.720 --> 00:05:57.400 how do you make sure that the agent, first of 93 00:05:57.400 --> 00:06:00.960 all, securely connects to these APIs and, 94 00:06:00.960 --> 00:06:04.680 second, doesn't mess with them, doesn't 95 00:06:04.680 --> 00:06:07.960 do anything that it's not supposed to be doing?
That is something that you really 96 00:06:07.960 --> 00:06:11.760 need to look at. As soon as you run into a production situation, you 97 00:06:11.760 --> 00:06:14.800 will most likely have security, scalability 98 00:06:15.440 --> 00:06:18.400 and connectivity issues with 99 00:06:18.800 --> 00:06:22.310 internal and third-party tools. Talking a little bit here about Model 100 00:06:22.310 --> 00:06:26.110 Context Protocol: it is getting a lot of traction in 101 00:06:26.110 --> 00:06:29.750 the AI engineering world. What is it and why does 102 00:06:29.750 --> 00:06:33.510 it matter now? One of the big questions that everybody had in 103 00:06:33.510 --> 00:06:37.310 regards to LLMs, to language models, when they came out a few years ago 104 00:06:37.470 --> 00:06:39.790 really was that there is 105 00:06:41.230 --> 00:06:44.870 a certain amount of unreliability, especially when it 106 00:06:44.870 --> 00:06:48.130 comes to them producing facts. 107 00:06:49.250 --> 00:06:52.850 These models have been trained on an incredible 108 00:06:52.850 --> 00:06:56.450 amount of data, but they don't really understand the data. 109 00:06:56.530 --> 00:06:58.530 So by the nature of how these models work, 110 00:07:00.210 --> 00:07:03.850 they don't really know whether what they're saying is 111 00:07:03.850 --> 00:07:07.650 actually true or not. They just look at 112 00:07:08.130 --> 00:07:11.810 whether it matches a certain pattern that they have 113 00:07:11.810 --> 00:07:15.330 seen quite often. And that's a really big limitation because in many 114 00:07:15.330 --> 00:07:19.180 applications, in many use cases, it's really important to work with 115 00:07:19.180 --> 00:07:22.940 factual data and not with assumptions or with 116 00:07:24.540 --> 00:07:27.340 made-up information that 117 00:07:28.780 --> 00:07:32.420 doesn't actually represent the facts. That can be 118 00:07:32.420 --> 00:07:36.100 pretty dangerous.
So most people in 119 00:07:36.100 --> 00:07:39.020 companies who've been working with LLMs, especially in the early days, 120 00:07:39.740 --> 00:07:42.460 ran into this problem: well, it doesn't really work because 121 00:07:43.760 --> 00:07:46.880 it isn't really reliable in what it says, it makes up data, 122 00:07:47.360 --> 00:07:50.880 it confuses things and so forth. So 123 00:07:51.520 --> 00:07:55.080 there are basically two ways to solve this 124 00:07:55.080 --> 00:07:58.160 problem. One way is 125 00:07:59.520 --> 00:08:03.160 adding more information to the model, 126 00:08:03.160 --> 00:08:06.680 fine-tuning the model, giving it access to a 127 00:08:06.680 --> 00:08:10.320 vector store so that it can do semantic search to retrieve 128 00:08:10.320 --> 00:08:13.920 actual data, to basically either 129 00:08:13.920 --> 00:08:17.560 double-check against actual data or just 130 00:08:17.560 --> 00:08:21.160 use this data as part of its process, which is called 131 00:08:21.160 --> 00:08:24.960 RAG, retrieval-augmented generation. And I 132 00:08:24.960 --> 00:08:27.640 don't want to go too deep into RAG, and I have a very 133 00:08:28.360 --> 00:08:32.120 specific opinion of RAG, not a bad one, but about what it actually 134 00:08:32.120 --> 00:08:35.240 is. I don't look at it from the perspective of a data scientist. 135 00:08:35.800 --> 00:08:39.559 Actually I think RAG is the 136 00:08:39.559 --> 00:08:43.399 second thing, or belongs to the second thing that we can do, which is 137 00:08:43.399 --> 00:08:46.999 basically connecting the AI to the real 138 00:08:46.999 --> 00:08:50.599 world at runtime.
So 139 00:08:50.679 --> 00:08:54.439 at the moment in time when I actually use 140 00:08:54.519 --> 00:08:58.039 the model. Pre-training, 141 00:08:58.279 --> 00:09:01.919 fine-tuning and these things, they all happen before 142 00:09:01.919 --> 00:09:05.200 I deploy the model into production, and everything 143 00:09:05.440 --> 00:09:09.280 after that happens when the model is in production, when my 144 00:09:09.280 --> 00:09:12.720 application runs. And RAG, tool use 145 00:09:13.440 --> 00:09:16.880 and other mechanisms basically solve this problem, 146 00:09:17.280 --> 00:09:21.000 connecting the AI to the real world. And that could be a database, 147 00:09:21.000 --> 00:09:24.600 that could be a vector store with a semantic search, but that could also be 148 00:09:24.600 --> 00:09:28.400 something where the AI can actually act, like call an 149 00:09:28.400 --> 00:09:32.160 API to trigger a process or to send an email 150 00:09:32.400 --> 00:09:36.110 or do things like that. And the challenge 151 00:09:36.110 --> 00:09:39.790 was that every API works differently. Every 152 00:09:39.790 --> 00:09:43.590 API has a different authentication mechanism, every database has a slightly different 153 00:09:43.590 --> 00:09:47.430 SQL dialect and so forth. So all the tools 154 00:09:47.430 --> 00:09:50.790 that I wanted to use, starting from a simple 155 00:09:50.790 --> 00:09:54.350 calculator, all the way to a weather API or to 156 00:09:55.150 --> 00:09:58.990 a third-party SaaS provider's API or anything, they 157 00:09:58.990 --> 00:10:01.190 all have their proprietary 158 00:10:02.230 --> 00:10:05.950 API.
So it was my task as a 159 00:10:05.950 --> 00:10:09.750 developer to basically manually 160 00:10:09.830 --> 00:10:13.510 code the connection to all of these APIs 161 00:10:13.590 --> 00:10:17.190 to be able to get the data or send a request or do whatever, 162 00:10:17.190 --> 00:10:20.150 to get it into the model or to have the model act on my process. 163 00:10:21.110 --> 00:10:24.950 MCP, Model Context Protocol, is an open source 164 00:10:25.750 --> 00:10:29.210 protocol that has been 165 00:10:29.210 --> 00:10:32.730 published and recommended by Anthropic. It's being widely adopted 166 00:10:32.730 --> 00:10:36.530 by Google, Amazon, Microsoft and many others. 167 00:10:36.850 --> 00:10:40.690 It looks like it's actually turning into an industry standard, 168 00:10:40.690 --> 00:10:43.890 which is a good thing because it simplifies 169 00:10:45.810 --> 00:10:49.410 that connection layer between the 170 00:10:49.410 --> 00:10:52.860 actual tool or API and 171 00:10:52.940 --> 00:10:56.380 your agent. It's a fairly 172 00:10:56.380 --> 00:11:00.220 simplified protocol that contains a few primitives like tools. 173 00:11:00.540 --> 00:11:03.860 So what can I actually do with that 174 00:11:03.860 --> 00:11:07.260 API as a model? That tool could be 175 00:11:08.140 --> 00:11:11.900 a browser engine, could be a runtime for me to run 176 00:11:12.700 --> 00:11:16.420 code in a sandbox environment, could be an API to a third- 177 00:11:16.420 --> 00:11:18.060 party provider, could be 178 00:11:20.250 --> 00:11:23.850 a piece of code that I wrote myself, could be basically everything. And with MCP, 179 00:11:23.850 --> 00:11:27.690 you're able to just wrap this as a so-called server, 180 00:11:28.330 --> 00:11:32.050 it's an MCP server.
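The idea of wrapping arbitrary functions as named, self-describing tools can be sketched in plain Python. This is a toy stand-in, not the MCP SDK and not the MCP wire protocol: the `ToolServer` class, its `tool`, `list_tools` and `call_tool` methods, and the two example tools are all hypothetical names chosen for illustration.

```python
from typing import Any, Callable, Dict, List


class ToolServer:
    """Toy registry that exposes plain functions as named tools (not real MCP)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._tools: Dict[str, Dict[str, Any]] = {}

    def tool(self, description: str) -> Callable:
        """Decorator that registers a function under its own name."""
        def register(func: Callable) -> Callable:
            self._tools[func.__name__] = {"description": description, "handler": func}
            return func
        return register

    def list_tools(self) -> List[Dict[str, str]]:
        """What a client (and through it, the model) would see: names and descriptions."""
        return [
            {"name": name, "description": entry["description"]}
            for name, entry in sorted(self._tools.items())
        ]

    def call_tool(self, name: str, arguments: Dict[str, Any]) -> Any:
        """Dispatch a tool call by name, rejecting anything not registered."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name]["handler"](**arguments)


server = ToolServer("demo")


@server.tool("Add two numbers")
def add(a: float, b: float) -> float:
    return a + b


@server.tool("Return a placeholder forecast for a city (no live data)")
def weather(city: str) -> str:
    return f"Placeholder forecast for {city}"
```

The point of the sketch is the uniform surface: whatever sits behind a tool, the caller only needs `list_tools` to discover it and `call_tool` to use it, which is the kind of hand-written glue a shared protocol is meant to replace.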
And then you can use the client 181 00:11:32.050 --> 00:11:34.650 library of the SDK inside of your own application 182 00:11:36.490 --> 00:11:38.970 to just connect to the server and 183 00:11:42.250 --> 00:11:46.050 include it into your agent so that you don't have to rebuild 184 00:11:46.050 --> 00:11:49.670 it for every single API that you want to use. That's MCP. 185 00:11:49.670 --> 00:11:53.270 It has a few more primitives: it's able to send 186 00:11:53.270 --> 00:11:56.750 notifications to the client, it's able to provide 187 00:11:56.830 --> 00:12:00.670 resources, static resources to the provider, and a few more. 188 00:12:00.990 --> 00:12:04.630 But the most important and most interesting use case for most 189 00:12:04.630 --> 00:12:08.030 AI applications really is the fact that it can expose 190 00:12:09.310 --> 00:12:12.750 its capabilities or the capabilities of the underlying 191 00:12:12.750 --> 00:12:16.550 API as tools that the LLM can 192 00:12:16.550 --> 00:12:20.150 understand and use. You said 193 00:12:20.150 --> 00:12:23.910 before that AI systems are only as powerful as their 194 00:12:23.910 --> 00:12:27.470 connections. What does it take to connect models to real-world 195 00:12:27.470 --> 00:12:31.310 APIs or workflows? I'm not sure how to answer 196 00:12:31.310 --> 00:12:35.150 this question. It 197 00:12:35.150 --> 00:12:38.270 takes a few things on a few different levels. Well, first of all, you should 198 00:12:38.270 --> 00:12:41.670 be aware of what you want to connect to 199 00:12:41.990 --> 00:12:44.950 and also the implications of 200 00:12:45.450 --> 00:12:49.290 connecting to these things.
For instance, if you connect to your 201 00:12:49.450 --> 00:12:53.210 email account, you give the model effectively access 202 00:12:53.290 --> 00:12:56.930 to all your data, possibly including PII, most 203 00:12:56.930 --> 00:13:00.450 likely including sensitive and personally identifiable 204 00:13:00.450 --> 00:13:03.690 information. That is something that you just need to be aware of. 205 00:13:04.170 --> 00:13:07.770 So once you connect your AI to this system and 206 00:13:07.850 --> 00:13:11.580 connect it to another system, you need to 207 00:13:11.580 --> 00:13:15.220 be aware of the fact that technically 208 00:13:15.540 --> 00:13:19.060 these two systems are connected through an intermediary, 209 00:13:19.700 --> 00:13:23.260 but they are effectively connected. So it 210 00:13:23.260 --> 00:13:26.100 takes thinking about 211 00:13:27.380 --> 00:13:31.180 what do I connect and do I really want 212 00:13:31.180 --> 00:13:34.260 these things to be connected with each other? 213 00:13:34.820 --> 00:13:38.660 We're talking about boundaries, we're talking about service isolation, 214 00:13:39.380 --> 00:13:43.200 something that we've been talking about in software architecture for a very long time: 215 00:13:43.840 --> 00:13:47.520 service isolation, just to make sure that a 216 00:13:47.520 --> 00:13:51.240 service, which could be an AI agent, only does what it's 217 00:13:51.240 --> 00:13:54.080 supposed to do and doesn't accidentally 218 00:13:54.480 --> 00:13:56.880 expose information 219 00:13:59.120 --> 00:14:02.920 to something else, even though it shouldn't. And this is 220 00:14:02.920 --> 00:14:06.640 even more important with LLMs because LLMs are by 221 00:14:06.640 --> 00:14:10.160 nature not deterministic. You cannot just 222 00:14:10.240 --> 00:14:13.160 code the model so that it 223 00:14:14.270 --> 00:14:18.030 doesn't cross this boundary.
You could try, but it's really 224 00:14:18.030 --> 00:14:21.790 hard and I wouldn't suggest you do that. So 225 00:14:22.030 --> 00:14:25.750 looking at this, let's say I 226 00:14:25.750 --> 00:14:29.510 have two systems and I need a connection between these two systems, one 227 00:14:29.510 --> 00:14:33.310 being my email, the other being maybe a chatbot 228 00:14:33.550 --> 00:14:36.830 that I provide to a customer as part of my website. 229 00:14:38.270 --> 00:14:42.030 I wouldn't want this to be one thing. I wouldn't want this to be one 230 00:14:42.030 --> 00:14:45.810 agent. I would want to separate these two, isolate these two. One could be 231 00:14:45.810 --> 00:14:49.410 an agent that talks to the customer through a 232 00:14:49.410 --> 00:14:52.810 chatbot interface and that then 233 00:14:53.050 --> 00:14:56.810 classifies the use case: well, is this a complaint or is this an 234 00:14:56.810 --> 00:15:00.250 order or is it a general inquiry? And then, 235 00:15:01.930 --> 00:15:05.570 if it's a complaint, this 236 00:15:05.570 --> 00:15:09.290 agent may then send a message 237 00:15:09.290 --> 00:15:13.110 to a different agent that is responsible for 238 00:15:13.830 --> 00:15:17.670 processing customer complaints and does that. And 239 00:15:20.070 --> 00:15:22.390 they don't share data between each other. 240 00:15:24.150 --> 00:15:27.910 It's more of a handoff situation where the classifying agent 241 00:15:28.150 --> 00:15:31.990 says, well, I call the complaint agent, 242 00:15:32.230 --> 00:15:35.910 tell them there's a complaint from this customer, and this agent 243 00:15:35.910 --> 00:15:39.530 comes back with maybe: okay, 244 00:15:39.610 --> 00:15:42.250 thanks, somebody's going to call you, 245 00:15:43.450 --> 00:15:47.090 and tells this to the agent that's talking to the customer.
And the 246 00:15:47.090 --> 00:15:50.850 agent tells the customer, well, okay, somebody's going to call 247 00:15:50.850 --> 00:15:54.610 you, or could you give me your email address or phone number? Would 248 00:15:54.610 --> 00:15:57.370 you like to be called or would you like to get an email? 249 00:15:57.370 --> 00:16:00.810 How would you like us to contact you? So the 250 00:16:00.810 --> 00:16:04.410 negotiation with the customer and the negotiation with 251 00:16:04.410 --> 00:16:07.050 the CRM, with access 252 00:16:08.410 --> 00:16:12.170 to sensitive information, should be isolated. That is 253 00:16:12.170 --> 00:16:15.850 something that I would have in mind 254 00:16:16.090 --> 00:16:18.890 from the start if I wanted to do something like this. 255 00:16:20.010 --> 00:16:22.970 I'm not sure. Did I answer your question? Well, 256 00:16:23.370 --> 00:16:26.930 partially. I 257 00:16:26.930 --> 00:16:30.650 totally do understand that we are very, very early in this whole world 258 00:16:31.050 --> 00:16:34.870 of AI agents and how the systems work. And 259 00:16:35.030 --> 00:16:38.870 currently, without global standards, I don't believe 260 00:16:39.270 --> 00:16:41.350 you can really completely 261 00:16:43.030 --> 00:16:46.630 answer that question. But let's talk about 262 00:16:46.870 --> 00:16:50.390 agentic workflows, how they are different from simple 263 00:16:50.549 --> 00:16:52.790 prompt chains or 264 00:16:53.750 --> 00:16:57.510 RAG pipelines. Sorry, RAG pipelines? Yeah. 265 00:16:59.430 --> 00:17:02.910 So RAG pipelines are a completely different beast. RAG 266 00:17:02.910 --> 00:17:06.560 pipelines themselves, well, they actually involve 267 00:17:07.040 --> 00:17:10.120 AI in a certain way, but not in the way that we talk about it 268 00:17:10.120 --> 00:17:13.680 right now. RAG pipelines are more a 269 00:17:13.680 --> 00:17:17.400 data preparation step.
How do you prepare your 270 00:17:17.400 --> 00:17:20.560 data, your siloed information, your customer data, 271 00:17:21.200 --> 00:17:25.000 your CRM, your product database, whatever you have? How do you prepare 272 00:17:25.000 --> 00:17:27.680 that so it can be used in a useful way 273 00:17:29.360 --> 00:17:32.540 by an AI? The other thing, prompt 274 00:17:33.420 --> 00:17:37.060 chains. Prompt chains can be part of an agent. 275 00:17:37.060 --> 00:17:40.900 But very often you actually don't even need an 276 00:17:40.900 --> 00:17:44.380 agent; a prompt chain could be enough. A prompt chain just being: 277 00:17:44.620 --> 00:17:48.420 you have a fairly deterministic workflow where you send something to an 278 00:17:48.420 --> 00:17:51.860 LLM, you get a response, maybe you send a second prompt based on the 279 00:17:51.860 --> 00:17:54.540 response, and then at some point the thing is done. 280 00:17:56.870 --> 00:18:00.590 The way I look at it is I distinguish between three types of 281 00:18:00.590 --> 00:18:04.390 agentic applications, or not agentic, three types of 282 00:18:04.390 --> 00:18:05.270 LLM or 283 00:18:07.750 --> 00:18:11.110 generative AI enhanced applications. The first being 284 00:18:11.590 --> 00:18:15.390 non-agentic. Those are all the use cases 285 00:18:15.390 --> 00:18:18.550 where you send something to an LLM and the LLM 286 00:18:18.710 --> 00:18:22.550 responds and that's it. Or 287 00:18:22.550 --> 00:18:26.390 a chatbot, where you chat with the LLM, the 288 00:18:26.390 --> 00:18:29.710 LLM responds and you have a back and forth between the person 289 00:18:30.670 --> 00:18:34.110 and the model. Those are non- 290 00:18:34.110 --> 00:18:37.710 agentic workflows. They can become quite 291 00:18:37.710 --> 00:18:41.270 complex and complicated, but they are mostly 292 00:18:41.270 --> 00:18:44.510 predetermined.
Either they're a loop, like in a chat, 293 00:18:44.830 --> 00:18:48.670 or there's a sequence of steps that needs to 294 00:18:48.670 --> 00:18:52.350 be done and this sequence is almost always 295 00:18:52.590 --> 00:18:56.430 the same or maybe has some decisions in between that you can 296 00:18:56.430 --> 00:18:59.070 do with traditional conditional 297 00:19:00.990 --> 00:19:04.790 steps. So the first of the three 298 00:19:04.790 --> 00:19:08.350 is non-agentic. The second is agentic 299 00:19:08.350 --> 00:19:10.350 AI. This is where 300 00:19:12.910 --> 00:19:16.760 the AI, the model, actually makes 301 00:19:16.760 --> 00:19:20.520 decisions, plans, does 302 00:19:20.520 --> 00:19:24.320 reasoning, understands that it doesn't 303 00:19:24.320 --> 00:19:27.880 have all the information it needs, asks you about 304 00:19:28.200 --> 00:19:31.919 this information, or reaches out to one of the tools 305 00:19:31.919 --> 00:19:35.760 via MCP, the tools that it has available, to 306 00:19:35.760 --> 00:19:39.400 get the information it needs for this specific use case, and is 307 00:19:39.400 --> 00:19:43.100 able to adapt the workflow based on the 308 00:19:43.100 --> 00:19:46.860 interaction, based on the available information, based on its 309 00:19:46.860 --> 00:19:50.420 own reasoning. So very often an agentic system 310 00:19:51.940 --> 00:19:55.460 starts by analyzing the 311 00:19:55.460 --> 00:19:57.780 task, coming up with a plan, 312 00:19:59.060 --> 00:20:02.660 maybe even storing that plan somewhere in the file 313 00:20:02.660 --> 00:20:06.380 system as a checklist for itself, using a tool, 314 00:20:06.380 --> 00:20:10.060 again using MCP, for instance, that gives it access to 315 00:20:10.060 --> 00:20:13.880 a contained file system, a 316 00:20:13.880 --> 00:20:17.440 temporary directory, basically, where it can store intermediate 317 00:20:17.440 --> 00:20:21.200 information.
So it puts its checklist, its plan in there and 318 00:20:21.200 --> 00:20:25.040 then it does something and then it goes back to the checklist, checks this thing 319 00:20:25.040 --> 00:20:28.840 off and then it realizes, well, for this I need some more information from 320 00:20:28.840 --> 00:20:32.560 the customer, goes back, chats with the customer 321 00:20:32.560 --> 00:20:36.080 to receive this information and so forth. So an 322 00:20:36.080 --> 00:20:37.680 agent basically 323 00:20:40.820 --> 00:20:43.060 perceives, acts. 324 00:20:44.740 --> 00:20:48.340 So it gets information, 325 00:20:48.340 --> 00:20:51.860 either from the user through the prompt or through a tool, through a 326 00:20:51.860 --> 00:20:53.620 database or something, gets information, 327 00:20:55.460 --> 00:20:58.500 decides based on this information and then acts 328 00:20:58.980 --> 00:21:02.500 within certain bounds. And these boundaries are basically 329 00:21:02.500 --> 00:21:05.990 defined by the use case and the capabilities of this agent. 330 00:21:06.150 --> 00:21:09.750 The third, so non-agentic, agentic, and the third basically 331 00:21:09.750 --> 00:21:13.310 being multi-agent systems. This is where multiple 332 00:21:13.310 --> 00:21:16.790 agents interact with each other 333 00:21:18.070 --> 00:21:21.429 to provide for an even more 334 00:21:21.429 --> 00:21:25.230 complex use case. This is something that 335 00:21:25.230 --> 00:21:28.550 I talked about before, where you may have an agent 336 00:21:28.870 --> 00:21:32.630 that interacts with the customer through the website and is able to 337 00:21:33.780 --> 00:21:37.620 kick off different processes depending on the customer and what they want. 338 00:21:37.940 --> 00:21:41.700 And this agent then communicates with an agent that's responsible 339 00:21:41.700 --> 00:21:45.540 for complaints and another agent that's responsible for ingesting orders 340 00:21:46.740 --> 00:21:47.620 and so forth.
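The handoff between a customer-facing agent and a complaints agent can be sketched as two objects that only exchange a small message. This is a minimal illustration under loud assumptions: the keyword-based `classify` function stands in for an LLM classifier, and `Handoff`, `ComplaintAgent`, `FrontDeskAgent` and the sample CRM record are hypothetical names, not part of any framework mentioned in the interview.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Handoff:
    """The only data that crosses the boundary between the two agents."""
    customer_id: str
    category: str
    summary: str


def classify(message: str) -> str:
    """Keyword stand-in for an LLM that classifies the customer's request."""
    text = message.lower()
    if any(word in text for word in ("broken", "complaint", "refund", "unhappy")):
        return "complaint"
    if any(word in text for word in ("order", "buy", "purchase")):
        return "order"
    return "inquiry"


class ComplaintAgent:
    """Owns the sensitive CRM data; nothing outside this class reads it."""

    def __init__(self) -> None:
        self._crm: Dict[str, Dict[str, int]] = {"c-42": {"open_cases": 0}}

    def handle(self, handoff: Handoff) -> str:
        record = self._crm.setdefault(handoff.customer_id, {"open_cases": 0})
        record["open_cases"] += 1
        # Only a status message goes back, never the CRM record itself.
        return "Thanks, somebody is going to call you."


class FrontDeskAgent:
    """Talks to the customer; it holds a handler callable, not the CRM."""

    def __init__(self, complaint_handler: Callable[[Handoff], str]) -> None:
        self._complaint_handler = complaint_handler

    def reply(self, customer_id: str, message: str) -> str:
        category = classify(message)
        if category == "complaint":
            return self._complaint_handler(Handoff(customer_id, category, message[:200]))
        if category == "order":
            return "I can help you place an order."
        return "How can I help you?"


complaints = ComplaintAgent()
front_desk = FrontDeskAgent(complaints.handle)
```

The design choice worth noting is that the front desk never receives a reference to the CRM data, so even a misbehaving model on the customer side has nothing sensitive in reach to leak.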
341 00:21:51.060 --> 00:21:54.660 So it can become infinitely complicated. 342 00:21:55.620 --> 00:21:59.060 I'm basically thinking about agents like I think about 343 00:21:59.060 --> 00:22:02.710 microservices. Actually, I would say an agent is 344 00:22:02.710 --> 00:22:06.390 when the AI deviates from simple if-this-then-that rules. 345 00:22:06.870 --> 00:22:10.430 Would that be a good definition? Even 346 00:22:10.430 --> 00:22:13.830 complex if-this-then-that rules. There are 347 00:22:13.910 --> 00:22:16.790 fairly complex workflows that can be 348 00:22:16.790 --> 00:22:18.790 deterministically defined, 349 00:22:20.310 --> 00:22:23.910 where the entire work process, no matter how 350 00:22:25.030 --> 00:22:27.600 complex the process is, is in itself 351 00:22:28.960 --> 00:22:31.840 deterministic and 352 00:22:32.800 --> 00:22:36.520 algorithmic. So you can do it with Step Functions or you can do 353 00:22:36.520 --> 00:22:40.280 it with an orchestration engine or something like that. In 354 00:22:40.280 --> 00:22:44.080 that case I wouldn't necessarily use an AI agent. 355 00:22:44.240 --> 00:22:47.840 I might use AI, for instance, as a front end to that process 356 00:22:48.080 --> 00:22:51.680 that understands natural language. So if 357 00:22:51.920 --> 00:22:55.520 I want to have the 358 00:22:56.400 --> 00:23:00.000 possibility to use Slack, do 359 00:23:00.400 --> 00:23:04.080 anything through Slack, I have an agent in Slack and I'm able to 360 00:23:04.080 --> 00:23:07.800 tell the agent, please do this for me. In the past 361 00:23:07.800 --> 00:23:11.520 I would have to use a very specific command format for Slack 362 00:23:11.520 --> 00:23:15.280 Ops. And now with LLMs I can use natural language 363 00:23:15.280 --> 00:23:18.680 and I just use natural language.
Then I have the LLM as 364 00:23:18.680 --> 00:23:22.140 basically the client, and the 365 00:23:22.140 --> 00:23:25.700 LLM is not an agent. It just takes the 366 00:23:25.700 --> 00:23:29.460 request and has a list of processes, classifies 367 00:23:29.460 --> 00:23:33.220 the request, extracts the relevant information and 368 00:23:33.220 --> 00:23:36.660 kicks off the processes. It becomes 369 00:23:36.740 --> 00:23:40.380 agentic as 370 00:23:40.380 --> 00:23:43.860 soon as this thing 371 00:23:44.820 --> 00:23:48.100 may have to make more involved decisions like 372 00:23:49.030 --> 00:23:52.350 maybe we need more information or maybe we have to call 373 00:23:52.350 --> 00:23:56.070 somebody. It becomes fairly complicated fairly quickly. So it's really hard 374 00:23:56.470 --> 00:23:59.790 to talk about it. But what I'm trying to say really is don't build an 375 00:23:59.790 --> 00:24:02.790 agent for everything. First of all, don't use AI 376 00:24:03.270 --> 00:24:05.190 if the problem can be solved without it 377 00:24:06.870 --> 00:24:10.590 in a fairly easy way. Second, don't build an agent if it can 378 00:24:10.590 --> 00:24:14.390 be done by a simple deterministic workflow, 379 00:24:14.390 --> 00:24:17.710 even if it involves an LLM. And don't 380 00:24:18.590 --> 00:24:22.230 build complicated, well, let me say it this 381 00:24:22.230 --> 00:24:25.950 way. If the solution to your problem is more complicated 382 00:24:25.950 --> 00:24:27.870 than the problem you're trying to solve, 383 00:24:29.550 --> 00:24:33.230 you are probably doing it wrong, 384 00:24:33.950 --> 00:24:37.750 if that makes sense. I see, 385 00:24:37.750 --> 00:24:41.510 I see. What does a typical production AI stack look 386 00:24:41.510 --> 00:24:45.150 like in 2025, especially for startups that are scaling fast? 387 00:24:45.790 --> 00:24:49.230 Well, you need a few things. One is obviously model 388 00:24:49.230 --> 00:24:53.030 serving.
So you need the model 389 00:24:53.030 --> 00:24:56.790 somewhere. Could be locally, could be something that you do yourself. As a 390 00:24:56.790 --> 00:25:00.430 startup, I wouldn't recommend doing that. I would really 391 00:25:00.430 --> 00:25:03.870 just suggest you use an existing model 392 00:25:03.870 --> 00:25:07.590 provider that provides the 393 00:25:07.590 --> 00:25:11.350 model through an API. Could be Amazon Bedrock, for instance. We have lots of 394 00:25:11.350 --> 00:25:15.070 different models and the list is growing. We have open-weights models like Llama, 395 00:25:15.070 --> 00:25:18.330 Mistral, DeepSeek and others. 396 00:25:18.570 --> 00:25:22.370 We have commercial models like Claude, our 397 00:25:22.370 --> 00:25:24.810 own Nova model family, we have... 398 00:25:26.730 --> 00:25:30.410 Now I'm blanking out. We have Cohere. We have many different models 399 00:25:30.490 --> 00:25:33.930 that you can use for different use cases. We also have many 400 00:25:34.250 --> 00:25:37.730 general-purpose language models. So you could use Amazon 401 00:25:37.730 --> 00:25:41.490 Bedrock to just talk to a model through an API, through 402 00:25:41.490 --> 00:25:45.240 a secure API. Well, it's secure, so you 403 00:25:45.240 --> 00:25:48.680 don't have to worry about what the model provider does with your data. We 404 00:25:48.680 --> 00:25:52.000 don't do anything with it. We don't even use it for model training. 405 00:25:52.640 --> 00:25:56.120 We just provide the model to you so that you can use it in a 406 00:25:56.120 --> 00:25:59.360 secure way, so that you can even build GDPR-compliant 407 00:25:59.920 --> 00:26:03.600 systems. And so that's model 408 00:26:03.600 --> 00:26:07.400 serving. You may need databases, either your own 409 00:26:07.400 --> 00:26:11.010 databases, maybe a vector store if you want to do some semantic search. 410 00:26:11.170 --> 00:26:15.010 But that's advanced, I wouldn't start with that.
And you need 411 00:26:15.010 --> 00:26:18.770 something that orchestrates the process. So 412 00:26:18.770 --> 00:26:21.570 something that basically takes the input and 413 00:26:22.050 --> 00:26:25.850 calls the actual language model. Because the language model itself cannot do 414 00:26:25.850 --> 00:26:29.170 anything. It only takes text, 415 00:26:30.050 --> 00:26:33.890 or something else depending on the modality, but we're talking about language models right now. So it 416 00:26:33.890 --> 00:26:37.570 takes text and it produces something based on that text. It doesn't do anything. 417 00:26:37.890 --> 00:26:41.490 So the orchestration engine connects to 418 00:26:41.490 --> 00:26:45.330 MCP tools, to the front end, to whatever you 419 00:26:45.330 --> 00:26:48.890 want to do, and to the LLM. And the orchestration framework 420 00:26:49.930 --> 00:26:52.970 could be one of the many open source frameworks that we have, like 421 00:26:53.210 --> 00:26:55.850 LangGraph is one, or 422 00:26:56.890 --> 00:27:00.610 LlamaIndex is another one. CrewAI is one. Strands Agents, 423 00:27:00.610 --> 00:27:03.290 this is one that we open sourced about two months ago, 424 00:27:04.410 --> 00:27:08.160 which is model agnostic, even provider agnostic. You can use 425 00:27:08.160 --> 00:27:11.880 Strands Agents with models from OpenAI 426 00:27:12.280 --> 00:27:16.000 or directly with Llama API or 427 00:27:16.000 --> 00:27:19.640 even with Ollama on your own machine. So you need an orchestration 428 00:27:20.120 --> 00:27:23.800 tool or engine, you need a 429 00:27:23.800 --> 00:27:27.520 model somewhere. You may need a database or some data 430 00:27:27.520 --> 00:27:30.760 for the model to work with. You might want to think about 431 00:27:31.400 --> 00:27:35.100 primary and secondary models; that's a bit more advanced as well. So the primary 432 00:27:35.100 --> 00:27:37.660 model is the general-purpose model 433 00:27:38.780 --> 00:27:42.620 that does the majority of the work.
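To make the orchestration idea concrete, here is a minimal sketch of such a loop. It is not how LangGraph, Strands Agents or any other framework works internally; the `fake_model` function, the `CALL`/`FINAL:` text convention and the `get_weather` tool are all hypothetical stand-ins so the example runs without a real model or MCP server.

```python
from typing import Callable, Dict


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a hosted LLM: text in, text out, no side effects."""
    if "TOOL_RESULT" in prompt:
        return "FINAL: It is 21 degrees in Berlin."
    return "CALL get_weather Berlin"


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one might be exposed through an MCP server."""
    return f"21 degrees in {city}"


TOOLS: Dict[str, Callable[[str], str]] = {"get_weather": get_weather}


def orchestrate(user_input: str, model=fake_model, tools=TOOLS, max_steps: int = 5) -> str:
    """Call the model, run any tool it asks for, and feed the result back."""
    prompt = user_input
    for _ in range(max_steps):
        reply = model(prompt)
        if reply.startswith("FINAL:"):
            # The model produced an answer for the user.
            return reply[len("FINAL:"):].strip()
        if reply.startswith("CALL "):
            # The model only *asks* for a tool; the orchestrator executes it.
            _, name, arg = reply.split(" ", 2)
            result = tools[name](arg)
            prompt = f"{user_input}\nTOOL_RESULT {name}: {result}"
            continue
        return reply
    raise RuntimeError("step limit reached without a final answer")


print(orchestrate("What's the weather in Berlin?"))
```

The point of the sketch is the division of labour: the model only produces text, and everything that actually happens (tool calls, step limits) lives in the orchestrator.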
And then you may want to use 434 00:27:42.620 --> 00:27:46.220 secondary models, for instance for very simple use cases, so you don't 435 00:27:46.220 --> 00:27:49.980 need to use the expensive ones. You can use very cost-effective models 436 00:27:49.980 --> 00:27:53.780 for simple summarization tasks, while you may want to use a 437 00:27:53.780 --> 00:27:57.300 more expensive reasoning model for the overall orchestration, for 438 00:27:57.300 --> 00:28:00.780 instance. Another thing that's part of the stack is evaluation and 439 00:28:00.780 --> 00:28:04.530 monitoring. And that is something where I really would say as 440 00:28:04.530 --> 00:28:07.130 a startup you should put that in place as early as possible. 441 00:28:08.170 --> 00:28:11.970 Monitoring: I think monitoring and observability are self-explanatory. You 442 00:28:11.970 --> 00:28:15.530 should be able to see what's happening. And you should also very 443 00:28:15.530 --> 00:28:19.290 early implement cost monitoring. Because if something 444 00:28:19.290 --> 00:28:23.050 goes wrong, especially in a non-deterministic system 445 00:28:23.050 --> 00:28:25.450 like an agentic AI system, 446 00:28:28.750 --> 00:28:32.550 if it runs into a loop, it may run up 447 00:28:32.550 --> 00:28:36.150 a big context that it recursively 448 00:28:36.150 --> 00:28:39.750 sends to the LLM and all of a sudden it becomes very expensive. You wouldn't 449 00:28:39.750 --> 00:28:43.509 want that. So please set up cost monitoring 450 00:28:43.509 --> 00:28:47.310 very early. And also I would recommend 451 00:28:48.110 --> 00:28:51.950 implementing an evaluation mechanism. Evaluation is 452 00:28:51.950 --> 00:28:54.360 basically testing, but for LLMs.
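The runaway-loop scenario can be illustrated with a small budget guard. This is only a sketch under stated assumptions: the price per 1,000 tokens and the budget are made-up numbers, and `CostMonitor` is a hypothetical helper, not an AWS service or API. In practice you would combine something like this with your cloud provider's billing alarms.

```python
class BudgetExceeded(Exception):
    """Raised when accumulated model spend passes the configured budget."""


class CostMonitor:
    def __init__(self, budget_usd: float, usd_per_1k_tokens: float):
        # Both values are illustrative assumptions, not real prices.
        self.budget_usd = budget_usd
        self.usd_per_1k_tokens = usd_per_1k_tokens
        self.spent_usd = 0.0

    def record(self, tokens: int) -> None:
        """Add the cost of one model call and stop the agent if over budget."""
        self.spent_usd += tokens / 1000 * self.usd_per_1k_tokens
        if self.spent_usd > self.budget_usd:
            raise BudgetExceeded(f"spent ${self.spent_usd:.4f}")


def simulate_runaway_agent(monitor: CostMonitor, start_tokens: int = 1000, max_steps: int = 10) -> int:
    """Simulate an agent stuck in a loop whose context doubles on every step."""
    tokens = start_tokens
    completed = 0
    try:
        for _ in range(max_steps):
            monitor.record(tokens)  # each call re-sends the growing context
            completed += 1
            tokens *= 2
    except BudgetExceeded:
        pass  # the guard stopped the loop before all steps ran
    return completed


monitor = CostMonitor(budget_usd=0.01, usd_per_1k_tokens=0.001)
print(simulate_runaway_agent(monitor), "steps completed before the guard tripped")
```

Because the context doubles each round, the cost grows geometrically, which is why a hard stop in the loop itself is worth having in addition to dashboards.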
So 453 00:28:57.080 --> 00:29:00.880 you take a specific model and you have a number of prompts for your 454 00:29:00.880 --> 00:29:04.640 system and some data that you get from your database or 455 00:29:04.640 --> 00:29:08.360 through RAG or through mcp and you plug these things together and 456 00:29:08.360 --> 00:29:11.800 you test them in different scenarios, maybe with different user 457 00:29:11.800 --> 00:29:15.360 inputs to a point where in most cases you are 458 00:29:15.360 --> 00:29:19.160 satisfied with the response. So you have you reach 459 00:29:19.160 --> 00:29:22.970 a certain threshold of reliability of your 460 00:29:23.370 --> 00:29:26.090 system to do what it's supposed to do. 461 00:29:27.050 --> 00:29:29.450 All of a sudden a model provider 462 00:29:31.450 --> 00:29:34.810 deploys an update of their model, a new version 463 00:29:35.690 --> 00:29:39.450 that has been trained on different data or has been fine tuned in a different 464 00:29:39.450 --> 00:29:41.770 way, which could break your 465 00:29:42.970 --> 00:29:46.410 system apart because a variable, a very important 466 00:29:46.410 --> 00:29:48.650 variable has changed. Or 467 00:29:50.830 --> 00:29:54.510 maybe you change your prompts that 468 00:29:54.510 --> 00:29:58.070 you use as part of the pipeline or Your data changes, the 469 00:29:58.070 --> 00:30:01.630 structure of your data changes. This could all lead to 470 00:30:03.070 --> 00:30:06.910 the fact that your overall system degrades in terms of 471 00:30:06.910 --> 00:30:08.750 reliability when it comes to 472 00:30:11.310 --> 00:30:15.070 how good the results are. 
And you can solve that by 473 00:30:15.070 --> 00:30:18.870 implementing an evaluation pipeline so that whenever you change 474 00:30:18.870 --> 00:30:22.430 anything, you run a 475 00:30:22.430 --> 00:30:25.990 number of prompts or a number of use 476 00:30:25.990 --> 00:30:29.630 cases against the system to see 477 00:30:29.630 --> 00:30:33.349 if the reliability drops beneath your threshold. And 478 00:30:33.349 --> 00:30:36.830 if it does, the test fails and you have to go look at it. 479 00:30:36.990 --> 00:30:40.830 That's very important. If you implement something like this as early as possible, 480 00:30:41.070 --> 00:30:44.430 just like with testing in general, you will be able to 481 00:30:44.740 --> 00:30:48.260 iterate much quicker than 482 00:30:48.340 --> 00:30:51.900 if every time there's a new model update, or every time the data structure 483 00:30:51.900 --> 00:30:54.820 changes, or every time you update your own prompts, 484 00:30:56.580 --> 00:31:00.020 the system falls apart because you realize it doesn't 485 00:31:00.020 --> 00:31:03.700 reliably create the solutions or 486 00:31:03.700 --> 00:31:07.420 the responses that you were looking for. Apart from 487 00:31:07.420 --> 00:31:11.140 that, well, we were at the question, what's part of the stack? So model 488 00:31:11.460 --> 00:31:14.120 serving, data access and orchestration, 489 00:31:15.240 --> 00:31:18.760 then mostly a primary model to start with. And I would suggest 490 00:31:18.920 --> 00:31:22.200 just starting with a primary model, then evaluation and monitoring, 491 00:31:22.440 --> 00:31:25.960 then maybe a data pipeline if you actually want 492 00:31:26.040 --> 00:31:29.680 to use live data that changes over time. But again that's a fairly advanced 493 00:31:29.680 --> 00:31:33.520 topic. And obviously security and compliance.
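An evaluation pipeline with a reliability threshold might look roughly like the sketch below. The test cases, the keyword-based check and the two stub models are assumptions made up for illustration; a real pipeline would call the actual model and usually needs a more robust scoring method than substring matching.

```python
EVAL_CASES = [
    # Each case pairs a prompt with a keyword the answer is expected to contain.
    {"prompt": "Summarize: The invoice is due on Friday.", "must_contain": "friday"},
    {"prompt": "Classify sentiment: I love this product.", "must_contain": "love"},
    {"prompt": "Translate to English: Guten Morgen", "must_contain": "morgen"},
]


def echo_model(prompt: str) -> str:
    """Stub 'healthy' model: echoes the prompt, so every keyword is present."""
    return prompt


def degraded_model(prompt: str) -> str:
    """Stub model after a hypothetical bad update: refuses everything."""
    return "I cannot help with that."


def pass_rate(model, cases) -> float:
    """Fraction of cases whose response contains the expected keyword."""
    passed = sum(1 for c in cases if c["must_contain"] in model(c["prompt"]).lower())
    return passed / len(cases)


def evaluate(model, cases=EVAL_CASES, threshold: float = 0.8):
    """Return (ok, rate); ok is False when reliability drops below the threshold."""
    rate = pass_rate(model, cases)
    return rate >= threshold, rate


print(evaluate(echo_model))
print(evaluate(degraded_model))
```

Run as part of CI, a failing `evaluate` call plays the same role as a failing unit test: it flags that a model update, prompt change or data change pushed reliability below the level you accepted earlier.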
Whenever you 494 00:31:33.520 --> 00:31:37.000 use sensitive data, whenever you use proprietary information, 495 00:31:37.320 --> 00:31:41.070 make sure that you comply to your internal 496 00:31:41.470 --> 00:31:45.230 compliance frameworks, to your customers, compliance frameworks to 497 00:31:45.230 --> 00:31:48.750 legal compliance frameworks, make sure that you use proper 498 00:31:48.990 --> 00:31:52.790 authentication, make sure that your agent can only do what it's supposed to 499 00:31:52.790 --> 00:31:56.510 do, that your agent doesn't have access to your customer 500 00:31:56.510 --> 00:32:00.150 database, while also chatting on the Internet with random 501 00:32:00.150 --> 00:32:03.670 people and perhaps by 502 00:32:03.670 --> 00:32:07.020 accident giving them access to your customer database. 503 00:32:07.570 --> 00:32:11.170 That's important. Security and compliance, that's part of 504 00:32:11.250 --> 00:32:15.010 any production stack and should be 505 00:32:15.090 --> 00:32:18.810 because these are the basic things that you need to make 506 00:32:18.810 --> 00:32:22.370 sure that you don't run into problems, most likely 507 00:32:22.370 --> 00:32:26.090 earlier than later. How do you 508 00:32:26.090 --> 00:32:29.850 guys from AWS support this kind of production 509 00:32:29.850 --> 00:32:33.370 grade AI stack from bedrock to step functions to vector 510 00:32:33.370 --> 00:32:36.610 DBs? 
Well, first of all we have a number of services 511 00:32:37.810 --> 00:32:40.210 around the bedrock family of services, 512 00:32:41.650 --> 00:32:45.290 and that is Amazon Bedrock itself, which 513 00:32:45.290 --> 00:32:48.770 is first of all model serving, 514 00:32:49.570 --> 00:32:53.410 where we provide secure and private 515 00:32:53.490 --> 00:32:56.610 access to models from different providers, including 516 00:32:57.330 --> 00:33:01.010 the current frontier models of most providers 517 00:33:01.250 --> 00:33:03.820 where you can just use models and be sure that 518 00:33:04.940 --> 00:33:08.580 your data is not being used for training or used for 519 00:33:08.580 --> 00:33:12.380 anything else. So we basically run these models 520 00:33:12.380 --> 00:33:15.700 inside of our own escrow accounts. They are air 521 00:33:15.700 --> 00:33:19.260 gapped. Everything you send to the model 522 00:33:19.420 --> 00:33:23.100 is not stored or reused for anything. 523 00:33:23.500 --> 00:33:27.260 It's just sent to the model. The model's basically brought to 524 00:33:27.260 --> 00:33:30.790 life, loaded into the GPU cluster. It 525 00:33:30.790 --> 00:33:34.430 runs and then it returns the response and then the 526 00:33:34.430 --> 00:33:38.270 models basically goes back down and all the data is gone 527 00:33:39.070 --> 00:33:42.910 apart from the actual model weights themselves because they 528 00:33:42.910 --> 00:33:46.709 need to be used for subsequent calls. So that is one thing. 529 00:33:46.709 --> 00:33:49.710 Amazon Bedrock, which provides access to models, 530 00:33:50.350 --> 00:33:53.470 including our own family of models, including 531 00:33:56.750 --> 00:34:00.330 the capability to actually fine tune certain models 532 00:34:00.970 --> 00:34:04.530 or distill models into 533 00:34:04.530 --> 00:34:08.130 smaller models. 
If you want to use smaller models, basically, 534 00:34:08.130 --> 00:34:11.930 let's say you want to use Llama 4, but you don't want to use 535 00:34:12.490 --> 00:34:16.090 the version that Meta provides. You want to distill this into a 536 00:34:16.090 --> 00:34:19.930 smaller model. You can do that with Bedrock. Again, very advanced features. 537 00:34:19.930 --> 00:34:23.490 I would not suggest starting with that. It's 538 00:34:23.490 --> 00:34:27.300 time intensive, it's costly, it's a use 539 00:34:27.300 --> 00:34:31.140 case for enterprises and it may be a use case for you 540 00:34:31.140 --> 00:34:34.700 once you are further down the road in 541 00:34:34.700 --> 00:34:38.420 adoption. The second thing that we provide is Bedrock 542 00:34:38.420 --> 00:34:42.100 Guardrails along with a few other capabilities, like 543 00:34:42.260 --> 00:34:45.900 Bedrock Knowledge Bases. Guardrails is 544 00:34:45.900 --> 00:34:49.700 basically to mask sensitive data or 545 00:34:49.700 --> 00:34:53.540 to block requests that contain sensitive data or responses that contain 546 00:34:53.540 --> 00:34:57.320 sensitive data, or to block requests or responses 547 00:34:57.320 --> 00:35:01.080 that violate ethical codes that you've defined or something like 548 00:35:01.080 --> 00:35:04.880 that. And Knowledge Bases is basically 549 00:35:04.880 --> 00:35:08.720 direct access to vector stores. And we also provide 550 00:35:08.880 --> 00:35:12.400 obviously services for vector stores with OpenSearch 551 00:35:12.560 --> 00:35:16.000 or with Postgres on RDS. We've just 552 00:35:16.000 --> 00:35:19.720 released S3 Vectors, Amazon S3 Vectors. 553 00:35:19.720 --> 00:35:23.320 So you can even store your vectors on S3 as object 554 00:35:23.320 --> 00:35:27.130 storage and use that, which is extremely cost effective 555 00:35:27.130 --> 00:35:30.730 if you compare it to traditional vector stores.
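The masking and blocking behaviour described for guardrails can be sketched in a few lines. To be clear, this is not the Bedrock Guardrails API; it is a hypothetical, regex-based illustration of the concept, with a made-up blocked-topic list and a deliberately simple email pattern.

```python
import re

# Simplistic email pattern, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

# Hypothetical deny list standing in for a policy you would define yourself.
BLOCKED_TOPICS = ("password",)


def apply_guardrail(text: str) -> dict:
    """Block text that touches a denied topic, otherwise mask email addresses."""
    if any(topic in text.lower() for topic in BLOCKED_TOPICS):
        return {"action": "blocked", "text": ""}
    masked = EMAIL.sub("[EMAIL]", text)
    action = "masked" if masked != text else "allowed"
    return {"action": action, "text": masked}


print(apply_guardrail("Contact jane.doe@example.com today"))
print(apply_guardrail("What is the admin password?"))
```

The same check can be applied in both directions, on the request before it reaches the model and on the response before it reaches the user.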
Because traditional vector 556 00:35:30.730 --> 00:35:34.370 stores are basically database servers, 557 00:35:35.170 --> 00:35:38.730 and they have to run and they cost money. And S3 558 00:35:38.730 --> 00:35:42.530 Vectors stores your vectors on S3. So you only 559 00:35:42.530 --> 00:35:45.690 pay for storage, not for a machine that's running all the time. You pay for 560 00:35:45.690 --> 00:35:49.370 storage and then of course for access. So you 561 00:35:49.370 --> 00:35:52.610 can save up to 90% of the cost 562 00:35:53.010 --> 00:35:56.410 compared to database- 563 00:35:57.450 --> 00:36:00.810 based vector stores. And then 564 00:36:01.530 --> 00:36:05.370 we have Bedrock Agents, which is an out-of-the-box system that provides 565 00:36:05.530 --> 00:36:09.210 agents in a fairly opinionated 566 00:36:09.210 --> 00:36:12.890 way. So you can just build an agent using Bedrock 567 00:36:12.890 --> 00:36:16.410 Agents. You don't have to do that much, 568 00:36:16.490 --> 00:36:20.010 but these agents are self-contained inside of 569 00:36:20.010 --> 00:36:23.420 AWS. And the third thing, and that is something that's 570 00:36:23.820 --> 00:36:27.340 just been in preview for a few weeks at the time of 571 00:36:27.340 --> 00:36:30.860 recording, maybe GA when you listen to this episode. GA means 572 00:36:31.340 --> 00:36:34.540 generally available. It's in a public preview right now. That's 573 00:36:34.540 --> 00:36:38.060 Amazon Bedrock 574 00:36:38.060 --> 00:36:41.700 AgentCore, and that's a family of services. That's very 575 00:36:41.700 --> 00:36:45.260 interesting because it gives you all the 576 00:36:45.260 --> 00:36:48.890 individual capabilities as building 577 00:36:48.890 --> 00:36:52.570 blocks that you can use.
That is, it 578 00:36:52.570 --> 00:36:56.370 has access to Bedrock models, obviously, but you can 579 00:36:56.370 --> 00:36:59.770 also use it to use models 580 00:37:00.570 --> 00:37:04.330 anywhere. So you can 581 00:37:04.330 --> 00:37:07.970 also use it with OpenAI, with Llama API, 582 00:37:07.970 --> 00:37:11.530 with your own Ollama or whatnot. And it's 583 00:37:11.850 --> 00:37:15.610 framework agnostic. So you can deploy 584 00:37:15.850 --> 00:37:19.490 your CrewAI agents or your LangGraph agents. 585 00:37:19.730 --> 00:37:22.610 You don't have to do it the AWS way. 586 00:37:24.210 --> 00:37:27.250 The next capability is memory, because very often it's important 587 00:37:27.810 --> 00:37:31.530 to maintain information across sessions. So when 588 00:37:31.530 --> 00:37:35.250 I talk to the agent right now, I want it to retain information 589 00:37:35.330 --> 00:37:39.010 about previous conversations. And that's a capability. It's called 590 00:37:39.010 --> 00:37:42.730 AgentCore Memory. Again, AgentCore Memory can 591 00:37:42.730 --> 00:37:46.450 just be plugged into an agent that you run on AgentCore, but it can 592 00:37:46.450 --> 00:37:50.190 also just be plugged into an agent that you run somewhere else. Again, 593 00:37:50.510 --> 00:37:53.910 it's provider and framework agnostic. The third 594 00:37:53.910 --> 00:37:57.470 capability is tools. So right now, 595 00:37:57.550 --> 00:38:00.830 as of now, we provide direct access to 596 00:38:02.190 --> 00:38:05.230 a sandboxed code environment, which is completely isolated.
597 00:38:06.030 --> 00:38:09.630 So if your agent creates code or if the 598 00:38:09.630 --> 00:38:13.150 user of your agent sends code, the 599 00:38:13.150 --> 00:38:16.880 agent can then just use one of these 600 00:38:16.880 --> 00:38:20.640 sandboxes to run that code in a completely secure 601 00:38:20.640 --> 00:38:24.360 and isolated environment. And right now it provides a 602 00:38:24.360 --> 00:38:27.640 Python runtime and TypeScript, most likely more in the future. 603 00:38:28.600 --> 00:38:32.040 It also provides access to a browser 604 00:38:32.280 --> 00:38:36.040 environment so that your agent can use the Internet, again 605 00:38:36.920 --> 00:38:40.600 in an isolated environment. Another capability, of 606 00:38:40.600 --> 00:38:44.450 course, is security and identity. You can do everything, 607 00:38:45.010 --> 00:38:48.610 of course, using IAM, Identity and Access Management, with 608 00:38:48.610 --> 00:38:52.290 AWS, but you can also use OAuth with 609 00:38:52.850 --> 00:38:56.450 any kind of OAuth provider or your corporate 610 00:38:56.450 --> 00:38:59.730 identity provider, your commercial identity provider that 611 00:39:00.130 --> 00:39:03.410 you're using anyway, to make sure only the people who 612 00:39:03.730 --> 00:39:07.530 should access your agents are actually able to access 613 00:39:07.530 --> 00:39:11.100 your agents. And then we have, and I 614 00:39:11.100 --> 00:39:13.660 realize I've been talking a lot and it's a lot of stuff. I'm going to 615 00:39:13.660 --> 00:39:17.500 summarize that in a second. Then one or two more things. Observability 616 00:39:17.580 --> 00:39:21.300 out of the box using OpenTelemetry, so you can use your 617 00:39:21.300 --> 00:39:24.220 existing observability stack if you want.
618 00:39:24.860 --> 00:39:27.260 And Gateway. 619 00:39:28.300 --> 00:39:31.660 AgentCore Gateway allows you to just wrap any 620 00:39:31.660 --> 00:39:35.500 API that you may already have and expose it as 621 00:39:35.500 --> 00:39:38.830 an MCP server, including discovery, 622 00:39:39.070 --> 00:39:41.470 including even the ability to 623 00:39:45.630 --> 00:39:49.310 sell your own agent or your own MCP server on the 624 00:39:49.310 --> 00:39:52.750 AWS Marketplace to other AWS customers if you want. 625 00:39:53.150 --> 00:39:56.950 So in summary, what AgentCore provides is all the building blocks that 626 00:39:56.950 --> 00:39:59.310 you might need to build an agent: memory, 627 00:40:00.590 --> 00:40:04.310 a runtime for the agents and the MCP servers, a gateway 628 00:40:04.310 --> 00:40:07.670 if you already have your API or your server and just want to wrap it 629 00:40:07.670 --> 00:40:11.290 as MCP. It has identity, it has 630 00:40:11.370 --> 00:40:14.410 observability, tools and memory. 631 00:40:14.970 --> 00:40:16.970 That's what we have. It's a lot, I realize. 632 00:40:18.970 --> 00:40:22.690 Yeah, it is. I 633 00:40:22.690 --> 00:40:26.490 was wondering, for our audience: have you built an AI prototype that 634 00:40:26.570 --> 00:40:30.210 almost made it into production? What blocked you? 635 00:40:30.210 --> 00:40:32.970 Tag us with your war story. 636 00:40:33.930 --> 00:40:36.650 We'll be back after a very, very short ad break. 637 00:40:42.470 --> 00:40:46.070 Dennis, some startups are using multiple 638 00:40:46.070 --> 00:40:49.830 foundation models at once. What's AWS's approach 639 00:40:49.910 --> 00:40:53.670 to multi-model orchestration and 640 00:40:53.670 --> 00:40:55.750 how do you manage that securely?
641 00:40:58.630 --> 00:41:02.230 Multi-model usage is one of the core premises 642 00:41:02.310 --> 00:41:04.950 because we say it doesn't make sense to use one model for everything, 643 00:41:06.720 --> 00:41:10.240 which is why we started Bedrock the way we did in the first place. It's 644 00:41:10.240 --> 00:41:14.000 not one model that you can use. You can use models from 645 00:41:14.000 --> 00:41:17.840 many different providers with many different capabilities. Because 646 00:41:18.560 --> 00:41:22.360 in many use cases you may want to use a general-purpose large language 647 00:41:22.360 --> 00:41:25.960 model, but you also may want to use a model to create your embeddings or 648 00:41:25.960 --> 00:41:29.800 to create images. Could be from a completely different provider. Or you want to 649 00:41:29.800 --> 00:41:33.630 use a reasoning model for involved tasks and a 650 00:41:33.630 --> 00:41:37.310 much less expensive small model for basic 651 00:41:37.310 --> 00:41:40.310 tasks like summarization or classification. 652 00:41:41.030 --> 00:41:44.590 And these can be from different providers. Then there are models that are 653 00:41:44.590 --> 00:41:48.230 specialized in translation of languages. 654 00:41:48.230 --> 00:41:51.990 There are models that may be specialized in creating 655 00:41:51.990 --> 00:41:54.950 code. So 656 00:41:55.590 --> 00:41:58.950 Bedrock and AWS have always looked at it through the lens that different 657 00:41:58.950 --> 00:42:02.510 customers need different things, different models, and individual 658 00:42:02.510 --> 00:42:06.210 customers may need different models for different use cases 659 00:42:06.290 --> 00:42:09.930 or even inside the same use case. So I wouldn't 660 00:42:09.930 --> 00:42:12.770 start that way.
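The split between an expensive reasoning model and a cheaper model for simple tasks can be expressed as a small routing table. This is a sketch only: the task labels and the model names (`small-model`, `reasoning-model`, `general-model`) are hypothetical placeholders, not real model identifiers on Bedrock or anywhere else.

```python
# Hypothetical mapping from task type to a model tier.
ROUTES = {
    "summarization": "small-model",
    "classification": "small-model",
    "reasoning": "reasoning-model",
}


def pick_model(task_type: str, default: str = "general-model") -> str:
    """Return the model for a task, falling back to the primary model."""
    return ROUTES.get(task_type, default)


for task in ("summarization", "reasoning", "chat"):
    print(task, "->", pick_model(task))
```

Starting with only the default entry and adding routes later matches the advice in the interview: begin with one primary model, then introduce secondary models once cost or latency becomes the thing you are optimizing.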
If I would just build a prototype, 661 00:42:13.810 --> 00:42:17.330 I wouldn't start with multiple models, I would start with one to start 662 00:42:17.330 --> 00:42:20.770 understanding the moving parts and how it works and the limitations. 663 00:42:21.330 --> 00:42:24.770 But you can certainly use multiple models, and 664 00:42:25.170 --> 00:42:28.290 in any kind of production application I probably would 665 00:42:28.770 --> 00:42:32.070 because that helps me reduce cost, 666 00:42:33.030 --> 00:42:36.750 reduce latency, because the larger the model, the longer it 667 00:42:36.750 --> 00:42:40.150 takes for the model to respond, and take 668 00:42:40.790 --> 00:42:42.790 these things into account. So yes, 669 00:42:44.070 --> 00:42:47.910 multi-model, using different models, even from different providers, 670 00:42:48.070 --> 00:42:51.110 is certainly something that I would suggest looking into 671 00:42:52.550 --> 00:42:55.830 once you have your basic use case down 672 00:42:56.230 --> 00:42:59.950 and once you go into: well, how can I optimize cost? How can I 673 00:42:59.950 --> 00:43:03.760 optimize latency? Is there a specialized model that 674 00:43:03.760 --> 00:43:07.080 helps me with certain tasks inside the workflow? 675 00:43:07.640 --> 00:43:11.360 How do we make sure it's secure? Well, just like everything 676 00:43:11.360 --> 00:43:15.040 on AWS, everything goes through the 677 00:43:15.040 --> 00:43:18.360 AWS API. So every model invocation goes through the 678 00:43:18.440 --> 00:43:21.840 AWS API, which is protected through IAM, through 679 00:43:21.840 --> 00:43:25.240 Identity and Access Management.
So you can have 680 00:43:25.400 --> 00:43:28.940 really fine-grained mechanisms to say 681 00:43:28.940 --> 00:43:32.540 who or which service or which third 682 00:43:32.540 --> 00:43:35.860 party or which agent is allowed, which process is 683 00:43:35.860 --> 00:43:39.380 allowed to interact with individual models, 684 00:43:39.620 --> 00:43:43.380 with individual data stores or tools that you provide. 685 00:43:45.140 --> 00:43:48.500 For our audience, I was wondering: what's one tool or 686 00:43:48.500 --> 00:43:51.860 pattern that helped you finally scale your project? 687 00:43:52.260 --> 00:43:55.780 Share it on Threads or LinkedIn and tag Startuprad.io. 688 00:43:56.460 --> 00:44:00.140 Dennis, what's your take on LLMOps or 689 00:44:00.140 --> 00:44:03.420 GenAIOps? Is it the same as 690 00:44:03.420 --> 00:44:06.300 traditional MLOps or something new? 691 00:44:07.260 --> 00:44:10.780 I'm getting into hot water when I start 692 00:44:10.860 --> 00:44:14.700 talking about that because it's 693 00:44:14.700 --> 00:44:18.420 not the same. And I'm not sure. What did 694 00:44:18.420 --> 00:44:21.420 you say? GenAIOps, LLMOps, 695 00:44:22.220 --> 00:44:25.840 MLOps. The definitions 696 00:44:25.840 --> 00:44:29.600 are in flux. Well, I'm pretty sure there is a definition for MLOps 697 00:44:30.480 --> 00:44:33.680 and there's probably also a definition for LLMOps.
698 00:44:34.800 --> 00:44:38.560 But the thing is that 699 00:44:39.119 --> 00:44:42.760 with the democratization of generative 700 00:44:42.760 --> 00:44:46.600 AI since the Chat GPT moment, effectively when everybody wants to 701 00:44:46.600 --> 00:44:48.670 build on top of generative AI, 702 00:44:50.820 --> 00:44:54.180 there's a new kind of discipline 703 00:44:54.580 --> 00:44:58.100 emerging which sits at the intersection of 704 00:44:58.500 --> 00:45:00.980 software developers and 705 00:45:02.980 --> 00:45:06.100 data scientists and 706 00:45:06.500 --> 00:45:10.260 machine learning engineers. And that's what's 707 00:45:11.300 --> 00:45:15.100 emerging as AI engineers. That's the term 708 00:45:15.100 --> 00:45:18.740 that's being used increasingly for this, where you 709 00:45:18.740 --> 00:45:22.270 don't go build the models yourself, 710 00:45:23.070 --> 00:45:26.670 you don't even necessarily fine tune the models and then 711 00:45:26.670 --> 00:45:30.350 deploy the models somewhere and run them. You use 712 00:45:30.510 --> 00:45:34.030 models, you build applications that use 713 00:45:34.110 --> 00:45:37.870 these models and traditional approaches to build an 714 00:45:37.870 --> 00:45:41.390 application that has both the intelligence 715 00:45:41.710 --> 00:45:45.070 of a language model or any kind of generative AI model 716 00:45:45.340 --> 00:45:48.140 and the, the, the 717 00:45:48.780 --> 00:45:52.180 capabilities of the piece of software you build. So the AI 718 00:45:52.180 --> 00:45:55.820 engineer understands how 719 00:45:56.300 --> 00:46:00.140 LLMs works, understands the limitations, understands the difference 720 00:46:00.140 --> 00:46:03.580 between models, but the AI engineer usually 721 00:46:03.580 --> 00:46:07.420 doesn't deploy these models, doesn't build and train these models. 722 00:46:07.420 --> 00:46:11.020 That's what happens, that's what the ML engineer does. 
723 00:46:13.060 --> 00:46:16.740 And when it comes to operations, I think it's very 724 00:46:16.740 --> 00:46:20.500 similar. LLMOps, or 725 00:46:20.660 --> 00:46:23.700 more specifically defined, probably 726 00:46:23.940 --> 00:46:27.260 MLOps, is really the 727 00:46:27.260 --> 00:46:30.820 operational aspect of building and deploying and running 728 00:46:30.820 --> 00:46:34.660 models, training models and everything around 729 00:46:34.660 --> 00:46:38.380 that. And GenAIOps, or 730 00:46:38.380 --> 00:46:40.020 AIOps if you will, 731 00:46:42.280 --> 00:46:45.880 is DevOps. But 732 00:46:45.960 --> 00:46:49.560 now it includes AI 733 00:46:50.040 --> 00:46:53.560 as another very important component 734 00:46:53.720 --> 00:46:57.240 which requires its own 735 00:46:58.840 --> 00:47:02.680 capabilities like evaluation. So you test AI 736 00:47:02.680 --> 00:47:04.280 differently than you test 737 00:47:06.530 --> 00:47:10.210 a front end or than you do load tests 738 00:47:10.690 --> 00:47:14.330 on an environment. You have to approach it in a 739 00:47:14.330 --> 00:47:18.130 slightly different way. It's the same thing in terms 740 00:47:18.210 --> 00:47:21.530 of what you have to do. You have to make sure it works. Every time 741 00:47:21.530 --> 00:47:25.170 you change something in your application, you have to make sure that it still works, 742 00:47:25.170 --> 00:47:29.010 that you don't have any regression, that you didn't introduce any bugs. 743 00:47:30.370 --> 00:47:34.050 Now there's a new class of regression, there's a new 744 00:47:34.050 --> 00:47:37.530 class of bugs, there's a new class 745 00:47:37.610 --> 00:47:41.410 of things that may introduce latency or that may 746 00:47:41.410 --> 00:47:45.210 introduce additional cost. And that class is based 747 00:47:45.210 --> 00:47:48.850 on the integration of AI.
And in 748 00:47:48.850 --> 00:47:52.170 my opinion, the operational aspect of 749 00:47:53.210 --> 00:47:56.810 this becomes a native part of DevOps over 750 00:47:56.810 --> 00:48:00.580 time. There are a lot of tools right now, there will 751 00:48:00.580 --> 00:48:04.340 be more tools in the future, but it's very different 752 00:48:04.420 --> 00:48:07.780 from what the data scientist and the ML engineer do. Have you 753 00:48:08.180 --> 00:48:12.020 seen any counterintuitive success story 754 00:48:12.580 --> 00:48:16.340 where there was less tech and that actually led to better 755 00:48:16.340 --> 00:48:19.340 performance of the AI in production? The most 756 00:48:19.340 --> 00:48:22.860 counterintuitive thing, something that I see fairly often really, 757 00:48:22.860 --> 00:48:25.860 is when you approach it saying, well, 758 00:48:26.500 --> 00:48:30.140 let's do this with AI and you realize actually we don't need AI for 759 00:48:30.140 --> 00:48:33.780 this, or let's build an AI agent because 760 00:48:33.780 --> 00:48:36.900 everybody's talking about agents right now, which is a good thing because 761 00:48:37.620 --> 00:48:41.340 it's an evolving space. But you realize actually I don't really need an 762 00:48:41.340 --> 00:48:44.740 agent because I can simply use an LLM 763 00:48:45.620 --> 00:48:49.220 for this. So the most counterintuitive thing 764 00:48:49.300 --> 00:48:53.060 is something that I have seen throughout my entire career in software 765 00:48:53.060 --> 00:48:56.740 engineering: less complex very often 766 00:48:57.540 --> 00:49:01.220 is more effective. So whenever you build something, 767 00:49:01.220 --> 00:49:04.620 I encourage you to try and to experiment with AI and AI 768 00:49:04.620 --> 00:49:07.540 agents, but I also 769 00:49:08.340 --> 00:49:12.140 encourage you to not try to just solve 770 00:49:12.140 --> 00:49:15.940 everything with AI.
And that may be counterintuitive advice, 771 00:49:15.940 --> 00:49:18.890 but it has always been sound advice in my experience. 772 00:49:22.170 --> 00:49:25.770 Last question for us here in the second interview, 773 00:49:26.010 --> 00:49:29.690 and thank you for sticking around with me because we've been together here in a session 774 00:49:29.690 --> 00:49:33.330 for more than two hours now. Zoom out for us, Dennis. 775 00:49:33.330 --> 00:49:37.090 What's the future of AI architecture? In something like two to 776 00:49:37.090 --> 00:49:40.610 three years, do you see MCP and agentic frameworks 777 00:49:40.610 --> 00:49:44.330 becoming the new standard? I have no 778 00:49:44.330 --> 00:49:44.730 idea. 779 00:49:47.860 --> 00:49:51.700 Literally, I have no idea. If you look back through 780 00:49:51.700 --> 00:49:55.140 the last two to three years since the ChatGPT 781 00:49:55.620 --> 00:49:56.500 moment, basically 782 00:49:59.940 --> 00:50:02.340 everything has changed so dramatically. 783 00:50:03.620 --> 00:50:07.180 The technology, the infrastructure, the capabilities of the 784 00:50:07.180 --> 00:50:10.980 models themselves, the availability, 785 00:50:11.140 --> 00:50:14.100 the open source frameworks, the work that the community is doing, 786 00:50:15.100 --> 00:50:17.740 the many, many startups that are around. 787 00:50:18.780 --> 00:50:22.580 Certainly many still try to solve old problems with new tools. But there are 788 00:50:22.580 --> 00:50:26.100 also so many niches where something incredible is actually 789 00:50:26.100 --> 00:50:28.620 happening and there's so much innovation happening. 790 00:50:31.100 --> 00:50:34.340 I don't even know. I'm going to go on vacation a week from now for 791 00:50:34.340 --> 00:50:37.740 three weeks. I don't even know what the world will look like when I'm back. 792 00:50:39.020 --> 00:50:42.760 It's really hard. I think agents... Well, first 793 00:50:42.760 --> 00:50:46.200 of all, AI is not like a flu.
It won't go away. 794 00:50:46.680 --> 00:50:50.440 It's going to stick around. Agentic 795 00:50:50.440 --> 00:50:54.200 AI, it's being hyped right now. But I also think 796 00:50:54.440 --> 00:50:58.200 it is a very important topic that either sticks around 797 00:50:58.200 --> 00:51:01.480 or evolves into something even more 798 00:51:03.480 --> 00:51:06.360 capable. The thing is, 799 00:51:08.530 --> 00:51:12.170 the best time to get in, to get 800 00:51:12.170 --> 00:51:15.730 involved, is now, because it's never going to be as 801 00:51:15.730 --> 00:51:19.410 simple as it is today. And I realize it is really hard. 802 00:51:19.970 --> 00:51:23.810 I'm lucky to be able to work with this stuff 803 00:51:23.890 --> 00:51:27.330 every day, all day long. And I'm still overwhelmed. I'm still 804 00:51:27.330 --> 00:51:30.890 overwhelmed. I've subscribed to so many newsletters and 805 00:51:30.890 --> 00:51:34.490 there's so much news and so many tools to look at and so many frameworks 806 00:51:34.490 --> 00:51:38.310 and so many, I don't know. I didn't even know where to start until 807 00:51:38.310 --> 00:51:41.750 I realized most of these newsletters and most of the 808 00:51:41.750 --> 00:51:45.430 experts that are around all of a sudden, they just copy from each 809 00:51:45.430 --> 00:51:49.230 other. Many of them, not all of them, but many really just copy from each 810 00:51:49.230 --> 00:51:50.870 other. And 811 00:51:51.270 --> 00:51:55.110 I'm fairly convinced that many of them 812 00:51:55.110 --> 00:51:58.790 really are just AI tools creating content on the 813 00:51:58.790 --> 00:52:02.150 socials, in newsletters and so forth. So it's really hard 814 00:52:02.750 --> 00:52:04.270 to, it's really hard to 815 00:52:06.270 --> 00:52:09.230 distill the actual signal from the noise right now.
816 00:52:10.750 --> 00:52:14.470 But at the same time, it has never been as easy as today 817 00:52:14.470 --> 00:52:17.630 because it's only getting more complicated. It's only going to be more 818 00:52:18.750 --> 00:52:21.950 so. For you, what's really important is to get started now 819 00:52:22.670 --> 00:52:26.510 and at the same time try to understand the fundamentals, not necessarily 820 00:52:26.590 --> 00:52:30.270 the math behind these models. You don't need a PhD in science 821 00:52:30.880 --> 00:52:34.560 or in math or anything. I certainly don't. I'm a developer. I don't 822 00:52:34.560 --> 00:52:38.280 understand AI, to be honest. But 823 00:52:38.280 --> 00:52:41.040 what I do understand very well by now is 824 00:52:41.920 --> 00:52:45.600 how can I use AI in a software application? 825 00:52:45.840 --> 00:52:49.520 What impact does it have on the capabilities of what I build, but also 826 00:52:49.520 --> 00:52:53.280 what impact does it have on the way I work? Those are two different 827 00:52:53.360 --> 00:52:57.180 levels. And I'm able to do 828 00:52:57.180 --> 00:53:00.780 that because I did the work to at 829 00:53:00.780 --> 00:53:04.020 least understand the fundamentals of 830 00:53:04.580 --> 00:53:08.420 what these models actually are and how 831 00:53:08.580 --> 00:53:12.180 they work in terms of their capabilities. So 832 00:53:12.260 --> 00:53:14.180 why do they get things wrong? 833 00:53:16.260 --> 00:53:19.900 Why do they have what we call hallucinations? Why do they 834 00:53:19.900 --> 00:53:23.360 have a hard time doing basic 835 00:53:23.360 --> 00:53:26.960 math while being able to talk for 836 00:53:27.600 --> 00:53:31.200 hours? So these are the things. And I 837 00:53:31.200 --> 00:53:33.360 invite you to listen to 838 00:53:34.560 --> 00:53:38.159 Joe's podcast. I invite you to have a look at the stuff that we put 839 00:53:38.159 --> 00:53:42.000 out at AWS, at the things that I put out on the socials.
840 00:53:42.560 --> 00:53:45.520 Follow me on LinkedIn. It's just Dennis Troup. I think 841 00:53:45.920 --> 00:53:49.360 Joe's going to put my contacts into the details. Ask 842 00:53:49.360 --> 00:53:52.090 questions, talk to people. Figure out, 843 00:53:53.450 --> 00:53:56.730 figure out how this stuff works. 844 00:53:57.290 --> 00:54:00.570 Experiment, play around with it. Don't be stupid. 845 00:54:00.890 --> 00:54:04.410 Don't connect a random piece of AI to your email. 846 00:54:07.370 --> 00:54:10.970 Don't put something out on the Internet and then run up a bill because 847 00:54:10.970 --> 00:54:14.610 somebody DDoSes you. Experiment 848 00:54:14.610 --> 00:54:18.410 in an isolated environment, maybe inside of an AWS account or 849 00:54:18.410 --> 00:54:22.090 on your local machine where everything's isolated and protected. 850 00:54:22.250 --> 00:54:25.430 You don't have to worry about external 851 00:54:26.070 --> 00:54:29.750 influences and maybe threats. Experiment, play around with it 852 00:54:29.750 --> 00:54:33.590 and at the same time think 853 00:54:33.590 --> 00:54:37.430 about the things that you might want to solve for 854 00:54:37.430 --> 00:54:41.190 yourself. Think about the things that you 855 00:54:41.830 --> 00:54:45.190 need to do manually every day 856 00:54:45.750 --> 00:54:49.550 because it was too hard to automate, or it was 857 00:54:49.550 --> 00:54:53.350 impossible to automate, or it was too costly, or you just didn't get around 858 00:54:53.350 --> 00:54:56.930 to automating it. Maybe AI can help you 859 00:54:57.410 --> 00:55:00.610 with a small problem that you have every day, 860 00:55:01.250 --> 00:55:05.090 that you're trying to solve every day, and it bothers you 861 00:55:05.090 --> 00:55:08.850 and it's so annoying. That is what I did. 862 00:55:08.930 --> 00:55:12.730 That's how I got started. That's how I learned.
I looked at what I'm 863 00:55:12.730 --> 00:55:15.970 doing every day and there's so much stuff that I never got around to doing. 864 00:55:15.970 --> 00:55:19.530 And I was complaining about it all the time and it bothered me 865 00:55:19.530 --> 00:55:23.370 all the time. And all of a sudden I realized I can build a small 866 00:55:23.370 --> 00:55:27.130 agent that just does it for me. And it doesn't even need 867 00:55:27.130 --> 00:55:30.850 to connect to sensitive data or it doesn't even connect to 868 00:55:30.850 --> 00:55:33.930 the Internet or anything. It is just a small 869 00:55:34.650 --> 00:55:38.370 CLI tool that I run automatically 870 00:55:38.370 --> 00:55:41.290 every day and it takes care of some stuff for me. 871 00:55:42.170 --> 00:55:45.130 It pulls some statistics or 872 00:55:47.730 --> 00:55:51.370 looks if there were some new conversations on Slack that I need to know about. 873 00:55:52.330 --> 00:55:56.010 These are the small things that I built. And by building these small things, 874 00:55:56.010 --> 00:55:59.530 I learned how they work, I learned how they fail. 875 00:56:00.010 --> 00:56:03.730 I learned about all the things that can go wrong. And then I 876 00:56:03.730 --> 00:56:07.490 started being able to build larger things, to build more 877 00:56:07.490 --> 00:56:10.930 complex applications, to build actual agents, actual 878 00:56:10.930 --> 00:56:14.740 agentic systems that I now run for 879 00:56:14.740 --> 00:56:18.420 more and more things. I have to admit though, I'm not running anything in 880 00:56:18.420 --> 00:56:22.180 production because I'm not building production software anymore. I haven't 881 00:56:22.180 --> 00:56:25.900 for a few years, unfortunately. But the great thing is, in 882 00:56:25.900 --> 00:56:29.019 my role, I get to experiment with that stuff a lot. 883 00:56:30.700 --> 00:56:34.380 Dennis, awesome last words.
Thank you very much 884 00:56:34.380 --> 00:56:37.740 for being such a good guest and telling us so much about 885 00:56:37.740 --> 00:56:40.460 AI and AWS and how they're working together. 886 00:56:41.570 --> 00:56:45.010 Thank you so much for having me. It was a great time. Thank you. 887 00:56:49.810 --> 00:56:53.330 That's all, folks. Find more news, streams, 888 00:56:53.570 --> 00:56:54.610 events and 889 00:56:54.610 --> 00:56:59.010 interviews at www.startuprad.io. 890 00:56:59.410 --> 00:57:01.650 Remember, sharing is caring.