From b0a8492f3fcd164a99108184203c1b7ead5ecf79 Mon Sep 17 00:00:00 2001
From: Logbot <40303173+changelogbot@users.noreply.github.com>
Date: Sat, 7 Dec 2024 07:43:46 +0000
Subject: [PATCH] Apply standardised formatter to practical-ai-297.md

This commit was automatically generated by the formatter github action which ran the src/format.js script

Files changed:
practicalai/practical-ai-297.md
---
 practicalai/practical-ai-297.md | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/practicalai/practical-ai-297.md b/practicalai/practical-ai-297.md
index 9effdf65..e1f37322 100644
--- a/practicalai/practical-ai-297.md
+++ b/practicalai/practical-ai-297.md
@@ -30,7 +30,7 @@
 
 **Chris Benson:** I'm just kind of amazed that people are surprised by that these days. It's like, you're going to see this stuff everywhere. And so - yeah, okay, iconic thing, I got it... But yeah, I mean, I would have been almost surprised if they hadn't.
 
-**Daniel Whitenack:** \[00:07:39.02\] Yeah. And yeah, if you just search for Coca-Cola ad, I think it's the "Real Magic Holiday" ad, which is also a bit ironic that they titled it Real Magic... When it's definitely not real. But yeah, you can watch it. It's pretty interesting. Whether or not it's really, really good ad material, it's I think a sign that for sure AI-generated video is here with us for the future.
+**Daniel Whitenack:** \[07:39\] Yeah. And yeah, if you just search for Coca-Cola ad, I think it's the "Real Magic Holiday" ad, which is also a bit ironic that they titled it Real Magic... When it's definitely not real. But yeah, you can watch it. It's pretty interesting. Whether or not it's really, really good ad material, it's I think a sign that for sure AI-generated video is here with us for the future.
 
 **Chris Benson:** You had a few some months back the actors going on strike... But I just think that it's one of those things we have a long way to go; not just in entertainment, but in most industries, where it's going to -- you're going to see corporate videos that are AI-generated. I've already seen that. I may not have seen the Coke one, but I've seen corporations that are doing it. It's the way it is now.
 
@@ -52,7 +52,7 @@ And yeah... Interesting. We've seen - maybe just as a reminder, we've seen the B
 
 **Chris Benson:** Often in the same sentence.
 
-**Daniel Whitenack:** \[00:11:42.06\] Yeah, maybe so. But one of the things I think that has been kind of promised, maybe as a part of just undoing some of the things of the Biden administration, which I think we can expect more generally, is a promise to repeal the executive order on AI, among probably other things... And I think citing the hindrance of innovation, this kind of anti-regulatory take on a lot of things... So there's a promise to repeal that. I'm not enough of a lawyer/politician/political analyst to know what exactly that undoes, because the executive order, I think, kind of has its tentacles in a variety of things that it touches, that are maybe not immediately related to the executive order, like the NIST, AI risk frameworks, and those sorts of things... So I don't know exactly how that works out. Maybe that's a point of confusion on my part.
+**Daniel Whitenack:** \[11:42\] Yeah, maybe so. But one of the things I think that has been kind of promised, maybe as a part of just undoing some of the things of the Biden administration, which I think we can expect more generally, is a promise to repeal the executive order on AI, among probably other things... And I think citing the hindrance of innovation, this kind of anti-regulatory take on a lot of things... So there's a promise to repeal that. I'm not enough of a lawyer/politician/political analyst to know what exactly that undoes, because the executive order, I think, kind of has its tentacles in a variety of things that it touches, that are maybe not immediately related to the executive order, like the NIST, AI risk frameworks, and those sorts of things... So I don't know exactly how that works out. Maybe that's a point of confusion on my part.
 
 **Chris Benson:** Yeah, my concern is - you know, there are some things that I think if you didn't just have a knee-jerk reaction to anti-anything that Biden did, that there are actually some things that the current administration and the incoming administration should be able to agree on. And one of those that's not AI, just as an example, is the CHIPS Act, which is kind of trying to bring semiconductor capabilities back online in the U.S. And if you are kind of -- if you're an administration that's anti-China, or in the China/Taiwan concern, which Trump has said he is, you would think that that's actually something that both sides of the aisle could agree to. But he's also said he's going to repeal the CHIPS Act as well. And I fear that the executive order, since it is something he can repeal with just the stroke of a pen, might suffer that. And yet, I think that he would be making a mistake regarding his own administration. I think that would create problems.
 
@@ -62,7 +62,7 @@ And yeah... Interesting. We've seen - maybe just as a reminder, we've seen the B
 
 **Daniel Whitenack:** Yeah. What is your take on the potential perspectives on open source, or closed, in the Trump administration? Any thoughts on that in terms of how that may be influenced one way or the other?
 
-**Chris Benson:** \[00:16:03.13\] I don't really know at that point. I think it depends on who's in the cabinet, potentially, and probably more specifically who's working on staff at the White House, and what their takes on it are. And I couldn't speak to that.
+**Chris Benson:** \[16:03\] I don't really know at that point. I think it depends on who's in the cabinet, potentially, and probably more specifically who's working on staff at the White House, and what their takes on it are. And I couldn't speak to that.
 
 **Daniel Whitenack:** Yeah, I've seen a mix of takes on that... I think there's one perspective that while China has benefited greatly from open source AI, not only have they been model builders and actually producing a lot of technology in the AI space, but they've also benefited a lot from meta and US AI technology... So there's kind of one side of it that would be "Well, let's lock that down", in the same way that they might try to restrict exports of other things, or that sort of thing.
 
@@ -70,7 +70,7 @@ But I've also seen the other take on the fact that you're basically anti-regulat
 
 **Chris Benson:** Yeah. I think there's a lot of ambiguity at the moment, because if you look at traditional conservatism, if you look at Ronald Reagan... Because a lot of Republicans really look back to that. Open trade is huge, but we're also having Trump talking about tariffs. That's been the news of the week. And that's kind of the antithesis of that. So it's kind of hard to figure out where the ball's going to land on those.
 
-**Break**: \[00:18:06.26\]
+**Break**: \[18:06\]
 
 **Daniel Whitenack:** One thing that was kind of brought up in the midst of this talk of the Trump administration and AI is this sort of AI and China discussion, where there's a thought AI is kind of thriving in China, and maybe China is pulling ahead in AI... I know we've talked about this on the show before. There's kind of this discussion of China and AI every time policy decisions are discussed on this show, and kind of factors in... And one of those things that I think is relevant is just the dominance of Qwen-based models in recent times. So if people aren't aware, one of the things that I think is interesting to follow recently is Alibaba's Qwen family of models. That's spelled Q-W-E-N, Qwen. The latest of these is the Qwen 2.5 model family. And generally, these Qwen 2.5 models are quite impressive. They generally top the open LLM leaderboards in various categories. You'll see them in the top spots.
 
@@ -82,7 +82,7 @@ Any interesting takes on that, Chris, in terms of how you've seen the model land
 
 **Daniel Whitenack:** Yeah, and maybe this is an interesting little diversion here, because I think some people don't understand the potential security risks as associated with this sort of model. So we say it's a model produced in China, some people would be uncomfortable because of China's use of data, or ways that they would use this technology... But if we look at the model itself - so you can go to Hugging Face and just search for Qwen models. So the Qwen models are open in the sense that you can go to Hugging Face, it's a repository of models... You can literally go to the Qwen model. You can download the weights of the model, and load that model into infrastructure that you control. So this model, when you think of the model, is composed of parameters and model code that runs that model. So if you go to the model on Hugging Face, you can download that.
 
-\[00:23:48.11\] Now, similar to like if you were to go to GitHub, and you look at all of the repositories on GitHub, some of those repositories on GitHub will have security considerations or licenses that won't allow you to use them, or sources that you don't trust... It's a little bit interesting here, because these models are kind of loaded into code that is maintained by Hugging Face, the Transformers library, or other serving frameworks. So if you're self-hosting the model, meaning you're pulling the model down from Hugging Face, the files, and you're loading it into code that can serve that model, that model serving is under your control and you're downloading those files, meaning you can inspect them. It doesn't mean there's no security vulnerabilities associated with them, but ultimately, all of that is under your control.
+\[23:48\] Now, similar to like if you were to go to GitHub, and you look at all of the repositories on GitHub, some of those repositories on GitHub will have security considerations or licenses that won't allow you to use them, or sources that you don't trust... It's a little bit interesting here, because these models are kind of loaded into code that is maintained by Hugging Face, the Transformers library, or other serving frameworks. So if you're self-hosting the model, meaning you're pulling the model down from Hugging Face, the files, and you're loading it into code that can serve that model, that model serving is under your control and you're downloading those files, meaning you can inspect them. It doesn't mean there's no security vulnerabilities associated with them, but ultimately, all of that is under your control.
 
 That is a different scenario than if you were to connect to an API that is serving the Qwen model, which there are ones from Alibaba and others, where this model is actually hosted as a product of a Chinese company. You're sending your data to that API product, which is then processing your data, and giving you a response back from the model. So I just wanted to emphasize there's kind of these two scenarios here. So one, in one scenario, the security vulnerability is really related to the model files that you're downloading. Is there any security vulnerability in those model files, which there could be? Is there any third-party code that's used when you load those model files, which there could be? And what serving framework are you using to serve them, which could have security vulnerabilities?
 
@@ -102,7 +102,7 @@ In the other case, you're relying on someone else's infrastructure, which isn't
 
 Also, there's some recent development... So we're late in November already, but this is I think about a week ago, something like that... Quinn Turbo 1 million was released, a sort of new version of this, which extends the context length of the Qwen 2.5 language models from 128k to 1 million tokens. So that's -- to kind of give a context, some of what's cited is like 150 hours of transcripts, or 30,000 lines of code, or these sorts of things. So lots of context can be put into these models, which is a trend that has continued, and I have my own opinions about, but it does seem to be a trend that continues.
 
-**Chris Benson:** \[00:28:06.22\] Go ahead and share them. You can't hang that out there and not go there now.
+**Chris Benson:** \[28:06\] Go ahead and share them. You can't hang that out there and not go there now.
 
 **Daniel Whitenack:** Yeah, well, I just think if you think about the typical, the most common enterprise cases that I run across in working with customers, most often these fit these scenarios of what I like to think of as something that could be done by a college-level intern, right? So you have some very clear instructions to do this sort of workflow... And it might be multi-step, it might be a complicated workflow, but it's all -- like, you can break it down in a sequence. There's instructions there. So anecdotally, if you go to a college level intern and you say "Go into the warehouse out back. There's rows and rows of documents. Now do this task for me." Right? That's a much harder thing, with a higher degree of potential failure, than if you go to the warehouse and you find generally the section that's relevant to a task and you say "Hey, look at these couple folders of documents and do the task." You're much more likely to get a better result, and I think these models anecdotally behave similarly... And there's some evidence for this in terms of the forgetting of what's in the middle of the context, which has been observed in academic research... And I'm sure people on this podcast will be like "No, Daniel, that's solved", whatever. It's just my own sort of experience and anecdotes in terms of what has been found to be useful. Yeah, a million tokens is a lot, so...
 
@@ -116,7 +116,7 @@ So, there's one that is from DeepSeek, which previously released a series of rea
 
 **Chris Benson:** So let me ask a couple of questions around that. Number one is - you know, we've seen so much in the news about kind of hitting the limit lately. OpenAI has come out, and talked about delays on future models, because they're kind of hitting practical limit... People have left the organization as a result of that... And just in general, that's been the conversation in industry over the last month or two.
 
-\[00:32:14.19\] As we do that, do you think that this is kind of the place that we're going to continue to see models evolving into, where instead of just getting bigger and larger context windows, and the whole thing, all that, always bigger, always better, that we're starting to see these kind of -- these 01 Preview styles, where they are pausing and they're bringing whole new techniques in to tackle certain types of problems? Are we maybe going down that path, as well as others?
+\[32:14\] As we do that, do you think that this is kind of the place that we're going to continue to see models evolving into, where instead of just getting bigger and larger context windows, and the whole thing, all that, always bigger, always better, that we're starting to see these kind of -- these 01 Preview styles, where they are pausing and they're bringing whole new techniques in to tackle certain types of problems? Are we maybe going down that path, as well as others?
 
 **Daniel Whitenack:** Yeah, yeah. From my perspective, at least, one thing that's happening is the gains that are being made from more data, and larger models, have basically plateaued... Which has been observed. Which means that smaller models that people are doing a lot of work to curate data for, and innovate in terms of their efficiency, are catching up rapidly to the larger models. So what would have been only possible by a 70B model, or a 400B model even six months ago or three months ago, is being done by 7B models or smaller. So you've got this small model trend where these models are actually performing at levels much higher than what was able to be seen before.
 
@@ -124,7 +124,7 @@ And then you have kind of branching out to various both specializations or domai
 
 And then I think you will see kind of an attempt to continue to develop new types of fine-tuning and prompting methodologies for things like this deep thinking, and for things like agent-related workflows, which I think people are going to be diving into more. So it may be more about the workflow, the prompting format, the prompting strategy as we move forward for just pure text models, than bigger and better models, bigger and better datasets.
 
-**Break**: \[00:35:01.29\]
+**Break**: \[35:01\]
 
 **Daniel Whitenack:** Well, Chris, speaking of a couple things that I think are pretty cool, and maybe even practical, that we could share as people are trying to level up things... One which is just fun, which is on my list of things to try this week is something that I've found - or someone pointed me to - which is called Pickle. Granted, I haven't tried this yet. I've actually just found it today. But it's not the Pickle -- if you're a Python programmer, Pickle means something very specific, which is a serialization format. But if you're not a Python programmer - yeah, if you just go to getpickle.ai, this seems like what I've been waiting for for a good long time...
 
@@ -146,7 +146,7 @@ And otherwise -- because most of the time... I don't know about you, Chris, but
 
 **Daniel Whitenack:** And I don't know -- let's take a stand-up, for example. An engineering team's stand-up, or something like that. Part of the idea behind such a thing, I think - I'm not a scrum master, but part of the idea would be to also actually hear with your ears other people's update, and maybe that influences... Either they're blocked on something, and you can reply, or it influences... So I'm wondering what this does, if it creates more potential isolation in an already remote work distributed environment. And part of me -- so I have a friend, Mark Sears, shout-out to Mark, if you're listening... He's working on a venture studio called Sprout AI. And one of the things that is one of their theses is that they wanna build technologies with AI that drive people relationally together as people.
 
-\[00:41:52.25\] So the idea, just to give an example, would be like, Chris, you and I, maybe we're friends, we're both busy, we're professionals... And so there's an AI assistant that maybe looks at your calendar and looks at my calendar, and looks at events going on in our town, or things that fit both of our interests, and then messages us both and says "Hey, Thursday night you're both free, and there's this event in your town. Are you guys --" And that's a sort of thing that is cool. It kind of drives people relationally together, it gets them out of their house.
+\[41:52\] So the idea, just to give an example, would be like, Chris, you and I, maybe we're friends, we're both busy, we're professionals... And so there's an AI assistant that maybe looks at your calendar and looks at my calendar, and looks at events going on in our town, or things that fit both of our interests, and then messages us both and says "Hey, Thursday night you're both free, and there's this event in your town. Are you guys --" And that's a sort of thing that is cool. It kind of drives people relationally together, it gets them out of their house.
 
 I think this idea of sort of embodied AI that would drive people relationally together is very appealing in our day and age, and something that's needed. But I also love the idea of joining meetings with my clone. So I don't know how to bring those together, but...
 
@@ -166,11 +166,11 @@ I think this idea of sort of embodied AI that would drive people relationally to
 
 **Daniel Whitenack:** Yeah, interesting. I think it will be interesting to see how people leverage these both ways. And like many things we've seen with this technology, there are opportunities for sort of restorative, positive, redemptive kind of uses of this technology, and there's ways that it can kind of drive us into isolation or create issues.
 
-\[00:45:56.22\] But yeah, along that front of kind of lifestyle-related things happening with AI, I've seen a couple of posts recently related to kind of payments and commerce and shopping in AI... The first of those being a blog post from Stripe, which talks about adding payments to LLM agentic workflows. And I guess there's better tooling now to the Stripe Agent Toolkit, which is - if you go to github.com/stripe/agent-toolkit, you can now kind of plug in Stripe as a tool or as a thing that can be leveraged by AI agents, including those from LangChain, CrewAI, Vercel's AI SDK, which - it's definitely pretty cool. It's that kind of scenario like "Hey AI, I need you to book a rental car for me next week." And obviously, that requires some sort of payment. I could also see it on the other end. Being a business owner right now, I'd love to say "Hey, create a recurring invoice for this customer, for this amount, with these line items, and send it to them with a message saying..." Whatever those things are. There's definitely a room for maybe misuse or problematic things happening here, but certainly very, very interesting to see this side of things advance.
+\[45:56\] But yeah, along that front of kind of lifestyle-related things happening with AI, I've seen a couple of posts recently related to kind of payments and commerce and shopping in AI... The first of those being a blog post from Stripe, which talks about adding payments to LLM agentic workflows. And I guess there's better tooling now to the Stripe Agent Toolkit, which is - if you go to github.com/stripe/agent-toolkit, you can now kind of plug in Stripe as a tool or as a thing that can be leveraged by AI agents, including those from LangChain, CrewAI, Vercel's AI SDK, which - it's definitely pretty cool. It's that kind of scenario like "Hey AI, I need you to book a rental car for me next week." And obviously, that requires some sort of payment. I could also see it on the other end. Being a business owner right now, I'd love to say "Hey, create a recurring invoice for this customer, for this amount, with these line items, and send it to them with a message saying..." Whatever those things are. There's definitely a room for maybe misuse or problematic things happening here, but certainly very, very interesting to see this side of things advance.
 
 **Chris Benson:** It is. I think it's a great thing, personally, the concept of an agent. I know it'll take people time to trust it and get used to it, but I know in our household at this point we tend to buy our groceries and have them delivered and stuff, because we're busy and doing stuff... And a lot of times it's the same stuff as you bought last week, but maybe with a few changes, because you're planning a different type of meal at some point during the week... And I think if you can combine the agent with the payment capability, and have the ability to kind of just smooth your life in that way... I know our family would love that. My wife would absolutely. She'd go nuts for it if that was available. She'd be like "Yup, I'm offloading that. Agent gets it all."
 
-**Daniel Whitenack:** \[00:48:17.23\] There's another -- I don't know if they're using the Stripe API under the hood, but there's another entrant into this, which is Perplexity now offers a sort of shopping assistant with an actual experience behind it kind of built in. So you have the ability to put in like "Hey, I'm doing this project, and I'm wanting to do this and that. What are the items that I need? Help me kind of shop for those." I think that's kind of the vibe. And there's a search that happens, obviously, and it's plugged into various products... In this case, they have a merchant program, which definitely seems... So I don't know whatever happened to kind of some of the monetization around like plugins and other things with ChatGPT, but this definitely seems like a way to kind of get your product... You know, having a wife that owns a business in the direct to consumer space, and sells products direct to consumer, there is this element of trying to figure out "Well, how do I place my product, or how does my product kind of filter up into search results when people are just searching on ChatGPT, Perplexity, whatever?" And so this does seem to be one angle on that, where you can increase chances of being a recommended product, there's payment integrations, API, custom dashboard etc. So there's this sort of merchant program element of the Perplexity AI-powered shopping assistant as well. Pretty interesting.
+**Daniel Whitenack:** \[48:17\] There's another -- I don't know if they're using the Stripe API under the hood, but there's another entrant into this, which is Perplexity now offers a sort of shopping assistant with an actual experience behind it kind of built in. So you have the ability to put in like "Hey, I'm doing this project, and I'm wanting to do this and that. What are the items that I need? Help me kind of shop for those." I think that's kind of the vibe. And there's a search that happens, obviously, and it's plugged into various products... In this case, they have a merchant program, which definitely seems... So I don't know whatever happened to kind of some of the monetization around like plugins and other things with ChatGPT, but this definitely seems like a way to kind of get your product... You know, having a wife that owns a business in the direct to consumer space, and sells products direct to consumer, there is this element of trying to figure out "Well, how do I place my product, or how does my product kind of filter up into search results when people are just searching on ChatGPT, Perplexity, whatever?" And so this does seem to be one angle on that, where you can increase chances of being a recommended product, there's payment integrations, API, custom dashboard etc. So there's this sort of merchant program element of the Perplexity AI-powered shopping assistant as well. Pretty interesting.
 
 **Chris Benson:** Very nice. I'm looking forward to all of it. Let's just adopt now. I'm ready for all of it. Go.
 