Transcript: Datasette, LLMs, and Django - Simon Willison

00:00.0

Hi, welcome to another episode of Django Chat, a podcast on the Django Web Framework.

00:09.8

I'm Will Vincent, joined by Carlton Gibson.

00:11.7

Hello, Carlton.

00:12.6

Hello, Will.

00:13.5

And we're very pleased to welcome back Simon Willison. Welcome, Simon.

00:17.2

Hey, Will.

00:17.7

Hey, Carlton.

00:18.6

Hey, Simon.

00:19.4

Thank you for coming on.

00:20.2

Really excited to have you again.

00:21.7

So for those who don't know, Simon is one of the original co-creators of Django.

00:25.0

He's currently working on Datasette.

00:26.3

He writes a lot about AI, LLMs, and much, much more, so we'll get into all that. But I'd start off

00:32.7

with, actually: you were at the most recent DjangoCon US, I guess last year, but day to day you

00:38.0

don't do, I don't think, a lot of Django. So I'm curious: how do you see Django 20 years in, as

00:43.1

someone who is familiar with it but isn't maybe as in the weeds as some other folks? How do you

00:48.3

assess its strengths and weaknesses in the web framework landscape as it is now?

00:53.2

So the thing I love about Django today is that Django qualifies as boring technology, and I'm

00:59.1

a huge... There's this incredible essay that Dan McKinley, online name mcfunley, put out,

01:04.4

this wonderful essay a few years ago about how you should pick boring technology, where what he means

01:10.1

is that anytime you're building something, there are things that you want to innovate on,

01:14.6

where you want to build something new and exciting and solve problems that have never

01:18.7

been solved before, and then there's everything else. And for everything else, you should pick

01:22.8

the most obvious boring technology you can, so that you're not constantly trying to figure out, oh, how

01:28.6

do I do CSRF protection in this framework I'm using, or whatever. Just make sure your defaults

01:34.0

are boring. And I love that Django absolutely qualifies now, right? I never in my wildest

01:39.7

imagination dreamed that Django would be the boring default choice for building things, but it

01:44.5

is. And so actually I'm building Datasette Cloud right now. It's SaaS hosting for my Datasette project.

01:50.0

The core of that is a Django app.

01:51.6

I've got a Postgres and Django app, which manages user accounts and manages signups and all of that kind of thing.

01:57.6

And then it launches Docker containers on Fly.io, which run Datasette and all of that.

02:02.7

So all of the exciting stuff I'm getting to innovate on in the corner, but the sort of bog standard bits that make the whole thing run, it's Django.

02:09.4

And that's great.

02:10.5

So, yeah, I love that.

02:11.6

I love that Django is now the safe default choice for building a web application.

02:16.1

Lovely.

02:17.0

Well, so you mentioned user accounts, I have to ask.

02:19.2

So Carlton's had some thoughts on, you know, maybe 20 years on changing some of the defaults.

02:25.5

Carlton, do you want to give your quick pitch and we'll see what Simon thinks?

02:28.4

Okay, yeah.

02:29.0

So my kind of take is that we've kind of got a leaky battery with the user model because we ask users to create this custom user.

02:38.5

And it's a whole world of complexity that for that central auth model, which is like for every single request, the identity of this user is X.

02:47.8

not the profile data, which obviously you want custom per app,

02:50.5

but we sort of have this custom user model,

02:52.8

which we forget to set up.

02:54.4

And we say, there's all these warnings in the doc,

02:56.8

how you should use it, but don't migrate to it

02:59.3

because that's too hard.

03:00.4

And I think we made a mistake there.

03:02.9

I think what we should have done is trimmed off

03:05.9

all the non-identity stuff from that user model

03:08.8

and then locked up django.contrib.auth really tight.

03:12.6

Couldn't agree more.

03:13.8

There are several flaws in the default user model.

03:16.7

Firstly, it expects everyone to have an email address, which doesn't work in 2024.

03:21.1

It makes people pick a username, which is very archaic.

03:24.6

And it expects your name to split into first name and last name, which for many cultures doesn't work.

03:30.7

So, yeah, I'm very with you that the user model has not dated well, unfortunately.

03:36.1

Yeah. So, I mean, what I'd kind of like to do is cut it down and trim it, you know, find a way slowly.

03:43.5

It's obviously over time because Django is very stable

03:45.4

and we have the migration policy, but just reopen that debate

03:48.6

about whether we can trim off the bits and somehow do it.

03:53.2

I think we gave up a little too early on that.

03:56.2

So I've been experimenting in that domain.

04:00.1

Yeah, I love that.

04:01.6

I mean, I use the user model as a key that other things key onto,

04:06.1

and then I have a separate table of Google accounts

04:08.9

that have been associated with an individual.

04:10.9

I don't make people pick a first name and last name.

04:13.5

Yeah, all of that kind of stuff.
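
The layout described here, a bare user record that other tables key onto plus a separate table of linked Google accounts, can be sketched roughly as follows. This is a minimal illustration using Python's built-in sqlite3 module rather than Django models, and it is not the actual schema behind any project mentioned in the conversation: the table names (app_user, profile, google_account), the column names, and the helper user_for_google_login are all hypothetical.

```python
import sqlite3


def build_schema(conn):
    """Create an identity-only user table with profile and login data kept separate."""
    conn.executescript(
        """
        PRAGMA foreign_keys = ON;

        -- Identity only: no username, no email, no first/last name.
        CREATE TABLE app_user (
            id INTEGER PRIMARY KEY
        );

        -- Per-app profile data lives in its own table (hypothetical columns).
        CREATE TABLE profile (
            user_id INTEGER PRIMARY KEY REFERENCES app_user(id),
            display_name TEXT NOT NULL
        );

        -- External accounts that have been associated with a user.
        CREATE TABLE google_account (
            google_sub TEXT PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES app_user(id),
            email TEXT
        );
        """
    )


def user_for_google_login(conn, google_sub, email=None):
    """Return the user id linked to a Google account, creating both rows on first login."""
    row = conn.execute(
        "SELECT user_id FROM google_account WHERE google_sub = ?", (google_sub,)
    ).fetchone()
    if row is not None:
        return row[0]
    user_id = conn.execute("INSERT INTO app_user DEFAULT VALUES").lastrowid
    conn.execute(
        "INSERT INTO google_account (google_sub, user_id, email) VALUES (?, ?, ?)",
        (google_sub, user_id, email),
    )
    return user_id


conn = sqlite3.connect(":memory:")
build_schema(conn)
first = user_for_google_login(conn, "google-sub-123", "person@example.com")
again = user_for_google_login(conn, "google-sub-123")
```

The point of the shape is that nothing about the person (name, email, username) is required to create the identity row, so those decisions can be made per application.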

04:15.8

Yeah, I know.

04:16.9

Okay, good.

04:17.9

Do you use django-allauth, or do you write your own then, it sounds like, to manage social authentication?

04:24.5

I mean, I use, yeah, so how am I doing social authentication?

04:28.4

I think I rolled my own Google OAuth thing, which keys against Django.

04:34.5

I think I've probably got code for that lying around somewhere.

04:37.5

And I've done that before in the past.

04:39.2

I tend to use the default user model, partly just for the admin.

04:42.7

It's the most convenient way to get the admin up and running, and I always have the... I mean, the admin is

04:47.1

such a key feature for me to quickly iterate on what I'm doing and build out internal tooling

04:52.0

and so forth. And yeah, I get it: when you're starting a new project as well, the last thing you want is

04:57.4

to stop and say, oh, I need half a day of planning my needs for a user model. You just want to start.

05:02.3

And then, yeah, I'm impressed you implemented your own OAuth. I've done it so many times at this

05:09.0

point. Right, okay, so it's nice. Yeah. I mean, so my blog runs on Django, simonwillison.net,

05:16.0

and that's open source. It's not a very complicated application, but it's all sat there on

05:20.5

GitHub, and I find myself tweaking that. About once every three or four months I'll go in and I'll

05:25.7

tweak something about it. It's always fun. It's also managed by Dependabot, so it magically

05:31.8

upgraded itself to Django 5 a few weeks ago. It just did it, which was great. I didn't have to think

05:37.8

about it. I've got just enough automated tests that I trust that the thing's going to work

05:43.4

after I apply updates. And yeah, that's a nice sort of way of staying connected

05:48.0

with what's going on in a very low-risk environment as well. Yeah, I saw you put out a post

05:53.2

a little while ago, just on the blog topic, about how to build a blog in Django. It was

05:57.8

really good. It was like a kind of checklist that you could run through of how to build a

06:01.3

Django blog. And I see everybody struggling with Hugo sites and this static site

06:06.1

generator or that static site generator, and I sometimes think, no, just run your own Django app,

06:10.0

because it's great to have that playground. And you've been doing it for like 20 years with

06:13.5

the same Django application, just evolving. It's lovely, really. Yeah, it pretty much... I

06:20.0

think the very first version of my blog was PHP running on my university's shared hosting

06:25.3

with flat files as, like... was it even... was it the PHP equivalent of pickle? I think it might have been

06:31.8

PHP's pickle equivalent of just a big array of posts and stuff. And yeah, then I flipped over

06:39.0

to Django, actually only about 2000 and... no, it's on my blog somewhere when I first

06:45.6

ported it to Django. And then I did a major upgrade in 2017, when I came back after not blogging for

06:52.7

like seven years, and did the Python 3 upgrade and stuff, and I've just been iterating on that ever

06:58.0

since. It's great. But also Jacob Kaplan-Moss, his blog is built on Django, but also if you go to his

07:03.9

GitHub repo, it says forked from Simon Willison. I'd forgotten that! Oh, that's brilliant.

07:10.4

Actually, I stole the feature off of him a few years ago. He has this idea of a series of posts

07:17.3

around a certain topic, and so I added series into my blog, inspired by what he'd been doing.

07:22.1

Didn't he open a pull request to get it merged into the upstream branch?

07:26.4

He didn't, but that wouldn't have surprised me if he had. That's what

07:31.1

Carlton would have done. Yeah, I don't know. Anyway, we'll carry on. Oh, I guess just

07:37.4

one more. Putting your kind of old man hat on with Django, I've heard you mention

07:43.4

the fact that, you know, Flask versus Django, the fact that Flask can be a single file. I don't

07:48.2

know if you kept up with this, but Carlton did a talk in 2019 on single-file Django, and then at

07:53.1

the most recent DjangoCon US, Paolo Melchiorre has, like, there's a whole repo of, I think

07:59.4

it's six lines of, like, kind of proving you can do Django in a couple of lines. And I guess I

08:06.4

wonder, I think about this for teaching, because again, my brother-in-law is going through a

08:10.6

coding boot camp and I'm like, hey, I'm here, let me help you. Like, oh, we're doing Flask. I'm like,

08:15.4

I almost feel like, I don't know if it's worth showing a single-file blog on

08:22.4

Django or something just to make the point that, hey, it's possible, because even in Flask,

08:26.3

no one does it that way. You could, but no one would do it that way. I do, I love the single

08:31.2

file thing. I actually built my own Django single-file thing like 10 years ago, something called djng,

08:37.5

and yeah, that was basically just trying to do a little thin shim that lets you do a Flask

08:44.4

imitation on top of Django.

08:47.0

Because I love that for just hacking out quick

08:48.8

things, not having to bother

08:50.9

about the directory structure and so forth.

08:53.1

So yeah, I'm thrilled to hear people are still

08:54.9

pushing ahead on that. It's

08:57.0

a great idea. If we were to design

08:58.9

Django today, I'm certain it would be

09:01.2

capable of doing single file

09:03.0

out of the box. That just makes sense to me.

09:05.3

Okay, one more and then I'll let you go, Carlton.

09:07.3

This question comes from Eric Matthes

09:09.1

who wrote Python Crash Course

09:11.1

and he was asking, you sort of answered it, but

09:12.8

What is your preferred way of building web apps today?

09:15.1

I think specifically on the front end, having seen it go from server-side rendered to jQuery to SPAs and now, I guess, HTMX.

09:23.4

But where do you fall on that pendulum?

09:25.9

So I spent a few years trying to do the React thing because it was clearly the way it was going.

09:32.7

And I hated it so much.

09:34.6

The thing I hated, it's the build script.

09:36.1

I hate it when you have a front-end project which you work on every six months, and you come back in six months, and nothing works.

09:43.7

You have to re-spin up your webpack configuration, all of that kind of stuff.

09:48.2

And so a few years ago, I said, you know what?

09:49.7

I'm going to give myself permission to write JavaScript like it's 2008 again.

09:54.7

And so no libraries, no build scripts, no TypeScript, nothing like that.

09:58.7

Just like a little bit – because the thing is that we used to use jQuery because of the browser differences.

10:04.0

But the browser differences are gone, right?

10:05.6

Today document.querySelectorAll and all of that stuff works exactly the same across everything,

10:10.8

so you can build code like you're using jQuery but without using jQuery. You just write event

10:16.6

handlers and so forth. And it was so liberating. Suddenly I enjoyed front-end development again,

10:21.4

because I didn't find myself fighting webpack and whatever, and Vite, whatever the new cool stuff is.

10:28.5

And I could go back to projects I wrote like this two years ago. I can drop in and I can

10:34.0

maintain them and I can add new features to them. And on top of that, the language model

10:38.8

stuff, ChatGPT, is really, really good at all forms of JavaScript, so it's not like I ever find myself

10:46.0

stuck trying to remember how a certain API works. If there's something which is going to be a bit

10:50.0

tedious because the JavaScript is going to be 20 lines of boilerplate, it'll spit out the 20 lines

10:54.4

of boilerplate. I can just let it go, let me get on with it. So yeah, I've got really into that.

10:58.4

I have played with HTMX on a couple of projects.

11:02.7

I really like it.

11:03.5

It fits my – I've always been into the sort of unobtrusive JavaScript,

11:08.1

the idea of progressive enhancement.

11:10.1

HTMX is so good for that kind of thing.

11:13.4

And so I really – I like that.

11:14.8

I love that that's getting popular, and I love the performance

11:17.6

that you get from it because you don't have to serve a megabyte

11:19.8

JavaScript bundle just to share a contact form or whatever.

11:24.3

And then Datasette itself is very strictly, it's just HTML.

11:29.6

And when you click a link, it loads a new page.

11:31.7

But I've been playing with the Chrome view transition stuff recently, which is super, super interesting.

11:39.5

Like cutting edge Chrome, I think you might still have to turn on one of the experimental flags.

11:44.1

You can actually serve up CSS that says, and when the user navigates from this page to this page, keep this area of the page stable and sort of like blur update this other bit.

11:53.2

And it's like a couple of lines of CSS, and suddenly it feels like a SPA.

11:58.6

You click a link, and only part of the page updates and so forth.

12:02.1

But it's a real navigation.

12:03.3

There's no JavaScript involved.

12:04.7

That's thrilling.

12:05.8

I can't wait to see that roll out to other browsers as well.

12:09.2

I have to ask.

12:10.9

Well, sorry, Carlton.

12:11.7

Go on.

12:12.3

I promised the last one.

12:13.2

Just with bundling, because I just did a redesign of my main site, which is using Tailwind.

12:19.6

And I like Tailwind, but it's a little disappointing.

12:22.7

I now have to have like Node and stuff running to,

12:25.4

it's almost like it's switched from JavaScript to CSS now

12:27.7

to have a build script for everything.

12:29.9

Yeah, this is one of the reasons

12:31.5

I've not adopted the modern CSS stuff as well

12:33.8

is I just, the build scripts,

12:36.4

they're fantastic for larger, more complex applications.

12:40.0

The stuff I do, I always try and keep it small

12:42.5

and simple enough that you don't necessarily need that.

12:44.8

And then they just become friction.

12:46.3

Like it's just something that prevents me

12:48.4

from being able to,

12:50.1

because I have so many projects on the go at once.

12:52.1

I've got, what, a hundred... nearly 200, it's,

12:57.6

I've got some ridiculous number of actively maintained projects.

13:00.8

And the only way to do that is to make it as easy as possible to drop into

13:03.9

something that you've almost forgotten all of the details of and get it up and

13:07.5

running again. I feel like with the front end build stack, if you do,

13:10.9

you work on the same projects every day, it's completely fine.

13:14.5

It gives you a huge productivity boost and that there's none of that friction

13:17.7

because you're, you've constantly got that stuff sort of warm in your head.

13:20.9

If you drop into a project every six months, it's completely different.

13:24.5

And that's what I like to optimize for being able to hop across hundreds of different projects

13:29.3

and make small changes to them without getting stuck on the building.

13:33.1

That's the exact same point as the boring technology talk, right?

13:35.9

Is if you focus on one or two or three technologies, then you're able to really get the most out

13:41.9

of them rather than spreading yourself thin over, you know, say half a dozen and that

13:46.2

slows you down.

13:47.0

It's the sort of same...

13:47.7

But the secret to running lots of projects is they've all got to be as boring and similar as possible.

13:52.0

Like I've got 100 odd repos.

13:54.3

They're all Python, Pluggy plugins, Jinja templates, like Datasette plugins.

13:59.8

They're all the exact same shape.

14:01.9

They've all got GitHub Actions running workflows and so forth.

14:05.0

It just works.

14:07.4

Okay, good.

14:08.1

Interesting.

14:08.8

So you mentioned LLMs there and ChatGPT and things like that.

14:12.2

But I wanted to ask you something before we, you know, talk about those in more depth,

14:16.1

which is, as well as doing all this amazing work in open source,

14:19.4

you're now on the board for the PSF.

14:22.3

Yes.

14:23.6

So can you tell us a little bit about what you're doing there

14:26.4

and how you're finding it because you're new.

14:28.1

This is your first year on the board.

14:29.8

It's my second year now.

14:31.1

I just hit the 12-month point, and it's interesting.

14:33.7

So the reason I'm on the board of the PSF is that I'd been hassling the PSF

14:39.8

on a sort of low-grade basis every now and then.

14:43.1

I'd go, I'm really annoyed that the PSF isn't doing more

14:45.7

to help make Python easier for people to get into, like solving the horrors of the Python

14:50.9

learning development environment, all of that kind of stuff. And also the fact that it's very

14:55.9

difficult to distribute applications written in Python, because, you know, you don't

15:01.4

want people to have to install Python to use your stuff. And I realized, I almost had a snap judgment

15:05.9

one day. I was like, you know what, it's not reasonable for me to complain at the PSF and not

15:11.5

offer to help and not try to do something. So I put myself up for election on the basis of:

15:18.2

these are the problems that I think the PSF should be addressing. And I got elected, which

15:22.6

was a little bit of a surprise, because I didn't really... I mean, I think it is name recognition,

15:26.2

because you show up on the list of names and people are like, oh, I recognize that person, or whatever.

15:30.6

And of course, now that I'm in the PSF, I realize the PSF is not particularly

15:35.4

well equipped to solve the problems that I was most interested in solving, because the PSF... Nobody

15:40.0

told you? Well, it's always difficult to understand quite what these organizations

15:45.3

are able to do. The PSF is basically, it's about money that's raised and is

15:52.3

distributed around the Python community and initiatives too. The PSF's focus is on the

15:57.9

community and the health of the community. There's a huge amount of sort of sponsorship of events,

16:02.2

of initiatives like that, which is fantastic. The stuff I care about, it's not completely

16:08.7

aligned with what the PSF is for, but it's not unaligned either.

16:12.4

So what I'm having to learn is, okay, how do I align what the PSF

16:17.4

can do with the things that I want to get done in a way that supports

16:20.5

the missions of the organization, and

16:21.9

so forth. So it's been a huge learning curve. You know, this is my first time on the board of a

16:25.8

non-profit. It's understanding what levers are available to pull and what priorities make sense

16:31.4

and so forth. And yeah, so the first year I was mainly just trying to understand

16:36.8

how this thing is shaped and what it can do. Now that I'm through that, I'm looking

16:41.2

forward to maybe trying to tweak those levers a little bit myself. To put all your weight on this

16:46.9

one. This is why the DSF, we just switched to two-year terms, exactly for this reason, because

16:53.1

it basically takes a year to get up to speed. And during COVID we had less turnover, and I feel like

16:59.6

we got a lot done because we had largely the same crew for two or three years. Because it does,

17:05.0

it takes a year to just understand how it works, right? Sorry, I interrupted, though. You were

17:10.4

gonna... No, I was just making a joke about the thing. But when Simon was talking, I was like, this

17:14.6

is exactly Will's experience with being on the DSF board, is that people think, oh, Django, the

17:18.6

DSF can do this, the DSF can do that. And, yeah, well, what I hear from Will, from his

17:23.4

experience on the board there, is that actually the DSF can't do very much. Well, I mean, I think

17:29.2

it's interesting. I mean, we're gonna have Jacob Kaplan-Moss on in a couple of weeks, who just

17:33.5

joined back the DSF after obviously working on Django and being, I think, the first

17:37.7

president. But when I joined, I had a similar list of things I wanted to do, which I guess,

17:43.8

in hindsight, I guess I was lucky that they aligned with, they were all things that could

17:47.4

be done around like sponsorship and, um, God, I forget. I have a blog post on it, but I didn't,

17:53.7

I hadn't thought about the fact that maybe the things I wanted done didn't align directly with

17:56.9

the mission, but, but you're right. It's fundamentally, these organizations are about

17:59.9

money and community and helping others. I mean, one thing DSF is, is now doing is having working

18:06.1

groups, which the PSF has had mixed success with, but at least some success, whereas historically

18:11.7

it's just been everything goes to the DSF. And when you're on the board, that's kind of its own

18:16.2

thing. It's unreasonable to be on the board and spearhead an initiative, from what I've seen.

18:22.5

I imagine it's similar on the PSF, or, I don't know, are you thinking of you actively doing it, or

18:26.6

more that you can help spin up a group? That's what I'm still trying to figure out, because

18:32.2

the other thing is that the PSF has staff.

18:38.5

Unlike DSF.

18:40.1

And the staff do incredible projects.

18:42.6

I mean, PyPI, I think, is one of the most impactful things that the PSF does outside of the PyCon and event sponsorship and so forth.

18:51.6

And so the directors are not there to do the work.

18:54.1

The directors are there essentially to sort of help make those high-level decisions, help set strategy, and, yeah, make decisions about where the money goes to a certain extent.

19:04.0

Yeah, it's understanding, okay, also, what's ethical and responsible to do?

19:09.5

Like, if I throw all of my weight in trying to push the PSF in one direction, am I actually starving other important initiatives that the PSF are doing that just don't happen to align with my own personal interests?

19:20.3

Right.

19:20.4

Well, in a sense, pure is not the word, but quite a few people work for big tech companies, and so there's even more of a potential of, I don't know, not conflict, but it gets a little, you've got to watch what you're doing.

19:34.0

Yeah, I mean, and I think there are rules about how many people on the board of directors can work for the same company, as there should be. Because yeah, that's always a risk with these kinds of things. Yeah, not being employed by a large company gives me an aspect of independence. But most of our board members are independent. I'm not unique.

19:53.7

Okay. I think maybe when Jeff Triplett was on the board, I think maybe it was the allocation was a little bit different.

20:01.2

But yeah, I mean, we just we just had new board members join a couple of months ago.

20:06.4

So I think we've had quite a reshaping just recently.

20:10.6

OK, just before we move on.

20:13.3

One last one, and I promise.

20:16.9

So, again, as this Django person, but a little bit of an outsider, I can ask you all the questions that, you know, I want someone to weigh in on.

20:24.2

So an executive director, we're going to have Deb Nicholson on the podcast in a couple of weeks.

20:29.4

There has been talk of Django potentially having one.

20:33.0

What has your experience been seeing an executive director at work in one of these organizations?

20:38.7

Can you imagine one of these organizations without one?

20:41.4

So is the DSF considering having a paid full-time person as an executive director?

20:45.6

It's something Chaim, the president, and others have... it's been discussed, and I'll give my two cents.

20:52.3

I think a lot of this stuff won't happen absent someone full-time to do it.

20:57.7

Absolutely, yeah.

20:59.4

That completely makes sense to me. This is one of the problems with boards

21:04.1

of directors: if everyone's just a volunteer who's investing a few hours of their time

21:10.1

a week, or maybe a month, it's very difficult to make progress on things. You'll have a

21:15.4

meeting and it'll be that not much will have happened since the last meeting. And once

21:20.6

you've got an executive director and staff, that completely changes. You know, there's constant

21:24.2

forward motion. Well, I keep on telling people, I think the best thing about the Django

21:28.8

Software Foundation has always been the Fellows, and this is something I

21:33.6

say: I don't understand why other community-driven open source projects aren't

21:38.6

trying to imitate this, exactly because it works so well. The PSF now has, I think, at least two fellows,

21:45.1

inspired by the Django Software Foundation, and those are incredibly impactful. You know, the

21:49.9

work that they're doing, the work that Seth's been doing around security as a sort of relatively new

21:53.8

addition, it's absolutely extraordinary how much impact you can have with that. So yeah, I'm very, very

21:58.7

keen on the idea of these non-profit open source supporting foundations that actually have staff

22:04.6

that can just keep on making progress on things. But just my experience, having been

22:11.4

the Fellow, is that these other tasks, these non-Fellow tasks, would arrive and it'd be like,

22:16.3

okay, well, I'll do that, you know, I've got a bit of time in the week, I can do that. But it wasn't

22:19.7

really the Fellow role, and there wasn't enough capacity to make any sort of significant progress

22:26.0

on, for instance, reworking the djangoproject.com website. Okay, I'd do a little

22:32.5

bit of work on it, but it's literally an hour or two here or there, and not the massive month-long

22:36.9

project, months-long project, that's going on now to actually do a proper assessment of what does

22:41.8

it need and how do we refresh it in a sort of professional, you know, to a 2024 kind of standard,

22:47.2

rather than just, oh yeah, can you make a tweak here. It turns out there's a lot to be said

22:53.4

for having somebody whose job it is to get specific things done. You know, that's... Yes, exactly.

22:57.9

Yeah, so I think that sounds very, very sensible to me. Yeah, I'm biased, and I realized,

23:04.2

I think it was Anna, the past DSF president, and I had a private

23:10.1

call with Deb Nicholson, the new one, and she sort of went through what one does in that

23:16.2

position, and we were just like, oh my God, we so need that. So yeah, I put that out there, but

23:23.2

I'm not on the board now. But Carlton? Yes, can I nudge us on? So I want to get... so I've been using

23:28.9

Copilot and whatnot, and I think it's awesome. You know, you mentioned JavaScript earlier.

23:34.0

My JavaScript's come on so much, because it's a bit like, how do I filter this

23:38.6

array to get the one value that I need? And previously that would take me 10 minutes of

23:43.1

looking up, because it's not something I do, you know, I do it once every six months. But now I can

23:46.6

just ask the LLM, it's got it. And it's not groundbreaking code, it doesn't have any

23:52.5

value other than it saved me 10 minutes. So I guess my question is, how can I

24:00.6

leverage that, and how can I continue to leverage that, and your tooling, how can I

24:06.4

install that, and can I get something equivalent to the closed source that I can use that's open

24:12.1

source? Wow, that's a whole bunch of things to talk about. Yeah, yes, but that's kind of...

24:18.5

Let's get into it. Yeah. I'm with you. The thing that excites me about LLMs is I love them

24:25.4

as sort of teaching assistants, right? It's something I can ask questions. I can ask the

24:30.8

dumbest question in the world at three in the morning and I'll get an answer, and it doesn't

24:35.7

judge me. And, you know, not knowing how to do a for loop

24:42.2

in Bash or whatever it is, you don't want to post it on the Django forum, like, how do I do this?

24:46.4

Exactly, exactly. I love that. And I love that it lets me be so much more ambitious with the projects that I take on. A great example: I shipped code in Go, because I needed a little high-performance network proxy router thing. And I ended up writing it in Go, because I don't know Go, but ChatGPT, GPT-4, knows Go throughout.

25:11.5

And I know Go just well enough to read the code and be able to tell if it's doing the right thing.

25:15.9

And I can get it to write tests.

25:18.2

So I ended up building this, like, 100-line little custom Go server thing with comprehensive unit tests and GitHub Actions running continuous integration.

25:26.2

I got continuous deployment running, all of the things that I consider to be important for, like, robust projects.

25:32.3

And I shipped it, and it's great.

25:34.1

Like, last month I had to make a change to it.

25:36.2

And I fired up GPT-4, and I worked with it, and we figured out what to do.

25:41.2

And absolutely, I mean, that was extraordinary, because normally I would never write something

25:45.4

in Go. I'd be fine tinkering with it, but I'm not going to write production code in a language

25:50.4

that I'm not completely fluent in. In this case, I feel like me plus GPT-4 is fluent enough

25:57.3

that I'm willing to deploy code written in a language I'm unfamiliar with. I've written code

26:01.2

in AppleScript. AppleScript is notoriously a read-only language: you can read it and see

26:06.3

what it does. There's like a continuum: there's AppleScript on one end and Perl on the

26:11.4

other, like read-only, write-only. Absolutely. But yeah, I'm using AppleScript for things. I'm

26:17.2

using all of these weird little domain-specific languages. I use jq all the time now, because jq

26:23.1

is really powerful but I can never remember the syntax. So I love that. I love it as a sort of

26:30.1

accelerator for me doing lots of things. I'm taking on more projects, which is terrifying,

26:35.7

because I already had too many projects, and I'm like, oh, I mean, me plus ChatGPT, I can probably

26:40.7

get something working in 20 minutes. And of course it takes two hours, but still, at the end of that

26:45.0

two hours I've got something that works and is interesting that I wouldn't have built otherwise.

26:48.6

um but it's that first 20 minutes that you wouldn't have put in that gets you to the two hours

26:53.8

i do so much coding on walks with my dog now because i can be walking the dog and i can on

27:01.5

my phone i can just like prompt it to write me some code that does this i can use the code

27:06.5

interpreter mode where it actually runs the python code it generates so i can get back from an hour

27:11.2

long walk with the dog and I've got 50 lines of Python that I know works, because it actually

27:16.0

ran the code found the bugs fixed them all of that kind of thing it's incredible like you can

27:21.5

even turn on voice mode i can literally talk to it while i'm on a walk with the dog and it writes

27:27.9

code for me that's utterly surreal that that's even possible so yeah i love i love that aspect

27:34.6

of it. And yet, as you mentioned, the problem with ChatGPT is that, for a company

27:41.6

called OpenAI, it could not be more closed, right? It's this proprietary hosted model. They change it

27:47.6

all the time without telling you what they've changed so people keep on complaining that it's

27:51.9

got weaker, it's worse at X, and so forth. I never know if that's actually true, because it's basically a

27:56.5

random number generator so it's very easy to assume that it's changed when it hasn't

28:00.7

but that's really frustrating and then but the great news is that in the past like 12 months

28:06.6

we've had so many new options for running these things ourselves these openly licensed models

28:11.6

that you can run on your own hardware and they're beginning to get pretty good like i don't use any

28:17.3

of them on a daily basis because gpt4 is so good so it's sort of my default but i'm constantly

28:22.9

experimenting with them. My favorite at the moment, my two favorites, are these Mistral models.

28:28.0

There's Mistral 7B, which literally runs on my telephone. Like, there's an app that runs it on

28:33.4

my phone and it's not awful like i was on a plane and i was using it to to do the kinds of things i

28:39.4

might have looked up on wikipedia and okay it'll probably hallucinate stuff so don't depend on it

28:44.0

telling you the truth but it's still useful for sort of getting things getting just starting to

28:49.2

explore different ideas. And then the other one is this new one called Mixtral, which is a Mistral

28:54.8

mixture-of-experts model. They just released that a month ago, and that runs on my laptop

29:00.3

and the quality begins to feel like ChatGPT 3.5. Like, it's very, very good. So if you've

29:08.0

been resisting using these things because you don't want to use some weird hosted model by some

29:12.1

closed "open" company, Mixtral is something you can run on your laptop right now. It's

29:17.5

Apache licensed, the whole thing is Apache licensed, although whether it's truly open source

29:23.0

is up for debate because they won't release the training data that was trained on which right is

29:28.0

i think that's the source code right i think for these models the the raw training data is the

29:33.2

source code that was used to compile the model because you can't open source that training data

29:37.7

because you ripped it all off. It's full of copyrighted data, and you can't just slap an Apache

29:41.6

license on someone else's copyrighted works but yeah so this stuff is really exciting it's really

29:48.0

interesting so i want to come back to your tooling but you've just mentioned the the copyrighted

29:52.9

training data thing. So there are lots of cases where the LLM will reproduce

29:58.9

its training data almost exactly in some cases. So the New York Times, we've got this

30:05.7

lawsuit perhaps you can explain the thing i could i kind of see it though and i'm like oh wow yeah

30:11.0

it is actually reproducing you know you type in underwater sponge and you get a sponge called

30:16.5

Bob SquarePants come out of one of the image generators, for instance. This is so fascinating. The

30:22.5

ethics of this entire space could not be more murky like every aspect of this space you're like

30:28.0

wow is that okay and the answer is maybe not it's all very bad and that so a lot of people have

30:35.5

ethical qualms against this and i agree with everything that they're saying you know the

30:39.3

The New York Times thing is – so the most recent thing is the New York Times filed a very big lawsuit against OpenAI a few weeks ago.

30:49.6

It was against OpenAI and Microsoft, and it was complaining about three different things.

30:54.1

It was complaining that, firstly, you took all of our work without permission, and it's copyrighted work, and you used it to train your model.

31:00.8

And I don't think anyone is disputing that that is what OpenAI did.

31:05.6

They used New York Times data as part of a vast amount of training data that went into these models.

31:12.5

It's effectively, you could look at it as it's a crawl of a sizable chunk of the Internet that was used to train these things.

31:18.3

But that included New York Times data.

31:19.8

The New York Times say that OpenAI put more weight on the New York Times data than they did on other data they trained on because of the high quality of that training data.

31:30.8

So I don't know if that's conclusively proved or not. I think the GPT-2 paper a few years ago did explicitly list that the New York Times data was being used like that. So they might be assuming that that's still true. It probably is still true. But this is one of the things I'm excited about this lawsuit is I want discovery because I want to know how GPT-4 was trained because they haven't told us. So, you know, if that comes out of this, that would be useful.

31:54.9

So complaint number one, they trained without permission. Complaint number two is that the models can spit out exact copies of New York Times articles. And this was news to me. I thought that the act of training muddled the stuff up to the point that it won't spit out exact copies.

32:10.5

it turns out if you set the temperature to zero and then feed it the first two paragraphs of a

32:16.5

New York Times article, it can often spit out the next four paragraphs, and sometimes there are very

32:22.3

slight differences, like one word will be changed, but effectively it's memorized and it's

32:26.7

regurgitating the same thing. But if you tried to publish that, that would be a clear

32:30.8

violation of copyright. It would be, exactly. And so the question is, well, are OpenAI publishing

32:36.1

that just by having an interface where people can see it.
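
[Editor's sketch: the "temperature to zero" remark above can be illustrated with a toy sampler. This is a minimal sketch, not how any particular model is implemented. The token list and scores are made up for illustration, and treating temperature zero as "always pick the top-scoring token" is an assumption about the usual convention. It shows why zero temperature gives the same continuation every time, which is what makes verbatim regurgitation reproducible.]

```python
import math
import random


def next_token_distribution(logits, temperature):
    """Turn raw scores into sampling probabilities at a given temperature."""
    if temperature == 0:
        # Temperature zero is treated here as greedy decoding: all probability
        # goes to the highest-scoring token (first one wins on ties).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample(tokens, logits, temperature, rng):
    """Draw one token according to the temperature-adjusted distribution."""
    probs = next_token_distribution(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Hypothetical scores for the word that follows some prompt.
tokens = ["the", "a", "its", "their"]
logits = [3.2, 2.9, 1.0, 0.4]

rng = random.Random(0)
greedy = {sample(tokens, logits, 0, rng) for _ in range(100)}
varied = {sample(tokens, logits, 1.5, rng) for _ in range(100)}

print(greedy)           # only ever the top-scoring token
print(len(varied) > 1)  # higher temperature produces a mix of tokens
```

At temperature zero every run picks the same word at every step, so a memorized passage comes back the same way each time; at higher temperatures the lower-scoring words get a real chance of being chosen.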

32:38.8

And that's, I mean, so many of these things, I don't think there's a, obviously,

32:43.9

legal i'm not a lawyer at all but there's a reason this is going to go to court because

32:48.5

these are legal questions that are very blurry and unanswered so complaint number two is it

32:53.4

can regurgitate their content and they've said um this means that people bypass our paywall by

32:59.1

getting the model to spit out articles which is a bit of a loose claim because you've got to have

33:03.8

the first three paragraphs of the article anyway but they did have a really interesting thing where

33:08.2

they talked about Wirecutter, right, where Wirecutter is a New York Times company. It does

33:12.3

product recommendations. If you ask ChatGPT for product recommendations, it will often spit out

33:17.9

Wirecutter's picks, but it won't give you the referral link that's Wirecutter's business

33:22.7

model and this is the the definition of fair use in american law specifically talks about um whether

33:30.3

the thing is competitive with the thing that it ripped off and so the new york times case the main

33:35.1

thing they're trying to demonstrate is this competes with us this is harming us financially

33:39.4

because you can bypass that paywall, you can rip off Wirecutter recommendations, all of that

33:44.6

kind of stuff so that's argument number two complaint number three is actually about retrieval

33:50.1

augmented generation. It's about the thing that Microsoft Bing does and ChatGPT Browse does,

33:57.1

where you can ask it a question, it goes and does a search on the internet, and it'll find the

34:01.4

new york times article about something read bits of it and then like summarize that and give you

34:05.5

the summary back again and so then the new york times is saying well look you're clearly subverting

34:10.4

our paywall. You're profiting from content that's derived from us. Now that one,

34:16.0

it's almost the one that worries me the most, in terms of, I think they've got a completely

34:20.1

fair point in complaining about this but summarizing stuff is my favorite use of llms

34:26.0

Like, if we end up with legal precedent that you can't even copy and paste

34:30.9

data into an llm to get a summary back out again that would be very harmful for for the sort of

34:36.4

the ways that these tools are most useful but that's the problem is that i read the 69 page

34:41.8

lawsuit and it's very clean it's very well argued and i think like oh like i said not a lawyer but

34:49.4

all of these points feel to me like points that are worth putting in front of a judge and jury

34:53.3

and and trying to get answers about yeah i think i mean two things come to mind from what you've

34:59.7

just said one is um i know google has been um told it has to pay news publishers in various

35:06.2

countries at various times because it does exactly that if you google the news in the country in

35:10.3

australia i'd pick australia i don't know if it's applied in australia but it will go and you know

35:14.7

get the sydney morning herald and summarize that without you ever having to leave google.com and

35:20.6

they were you know the one thing about that is that those lawsuits they were just about the

35:25.2

headlines the headlines like the first few words of the story even and what these what generative

35:30.7

AI is doing is so much more than that. Google are

35:36.7

clearly going to be on the chopping block next, after OpenAI and Microsoft, because

35:40.8

they've got a prototype um like an alpha version of their search page that does exactly that it

35:46.2

just adds generative ai and it spits out a generated answer to your question at the top

35:50.8

they've been doing this with their like little content snippet boxes and so forth as well over

35:54.5

the past few years and it's super worrying right if you've got a web where nobody ever clicks a

35:59.4

link from a search result because they just get their answers right there in search what point

36:04.1

is there in trying to like build a profitable web business anymore you know so all of these

36:09.2

ethical complaints are very very legitimate here's a meta question for you so we know now that

36:15.5

llms are being used to generate a lot of the content on the internet how do you see this

36:21.6

going forward if the LLMs are going to be trained on themselves? Do you think that, like,

36:25.7

is 2021 the high point, or is there a way out of that? Because it seems a

36:32.2

bit like a vicious circle. It seems like an ouroboros situation, doesn't it? And

36:37.3

people have been talking about this for a couple of years now and um at one point i heard that

36:42.0

OpenAI, the reason they hadn't updated their training data, like, there was a training cutoff

36:46.8

of what, September 2021, I think. And the reason they hadn't updated it is that after that point there

36:53.0

was enough usage of these tools the internet was beginning to fill up with llm generated text and

36:57.0

they didn't want to train llms on llm generated text because of the ouroboros effect at the same

37:02.7

time in the openly licensed language model community almost all of the really good ones

37:08.3

are actually trained on GPT-4 output. Like, the way you build a really useful chat-

37:14.7

tuned language model is you need to give it 20,000 examples of good conversations and the easiest way

37:20.4

to get those is to get GPT-4 to spit them out, and then you train your model on GPT-4 output. And so if

37:26.2

that was such a bad thing we wouldn't be seeing models that were trained almost exclusively like

37:30.7

that show up at the top of the leaderboards so i think this is all i mean this is all part of the

37:34.6

larger problem that we really have very little insight into how these things work they are giant

37:40.0

like, 16-gigabyte blobs of floating-point numbers, and we're still trying to figure out

37:47.4

just the basics of how you sort of poke around inside that weird matrix brain and figure out

37:52.8

how it's working and what it's doing. And so maybe the fears of LLMs training on LLM output

37:58.2

don't actually work out. Maybe it's okay, maybe it's a complete catastrophe. We have no idea.

38:04.3

and it's funny that we had no idea six months ago and it feels like we still have no idea now

38:08.3

So despite the rate at which this technology is improving, the rate at which our understanding of it improves is very sort of dubious, in terms of how much we can figure out.

38:18.4

So it really is a new world.

38:21.1

It is. And as a computer scientist, it's infuriating, right?

38:24.3

Because I like computers that do exactly what you tell them to do.

38:27.8

And you can write tests and you can fire up a debugger and everything is repeatable and understandable.

38:34.8

And these are not that at all.

38:36.2

It's like a completely sort of weird, blurry alternative world in which everything's based on vibes.

38:43.0

You come up with, you pick a model and you poke around with it and you see if the vibes feel right.

38:48.5

And then you tweak your prompts.

38:49.8

And does that seem better?

38:51.0

I mean, it kind of does, but it's awful.

38:53.7

It's really difficult to do sort of responsible development on top of it.

38:58.3

It does seem like the closed LLMs, like, you know, like if I'm a hospital or if I have billing records or like very niche-y things,

39:05.8

LLMs are fantastic. And especially like I'm in Boston, there's a lot of research places. They're like, we can't use an open LLM thing, but these closed things are definitely being sold and used on whatever industry company has huge amounts of their own data. I would say I almost feel like that's got more promise than this, like the entire web being, you know, stolen approach in the long run.

39:29.9

Well, the flip side of that is, so Bloomberg built their own, they trained their own language model on the internal financial documents. It was supposed to be the best possible LLM for finance. And then it turned out that GPT-4 came out, and as a general purpose model, it was beating the Bloomberg one on financial tasks.

39:50.2

But this is one of the things that's so challenging right now is the rate of improvement of these

39:54.5

things such that if you've got a project that will take six months, you maybe shouldn't

39:58.5

do that project because you might spend six months on it and then GPT 4.5 comes out and

40:03.8

it solves the problem that you just spent six months trying to solve.

40:07.2

And so there's this interesting strategic problem where at what point do you actually

40:11.2

settle down and start building on this stuff as opposed to thinking, you know what would

40:14.7

be quicker is if I waited two months and then started building because I'd get a better

40:19.2

result than if I started building today. And that's absurd, but that's genuinely the position

40:23.7

that we find ourselves in. That's Zeno's paradox for the 21st century. Completely.

40:30.2

Well, have you ever read, there's this book, AI Superpowers, that's a couple of years old now

40:34.8

by a Chinese American. He works in China, a US researcher. And I read that, I think I read that

40:42.2

five years ago. He summed all this, this is before OpenAI came out, but he basically said,

40:47.5

you need three things. You need the algorithms, which finally, like we had at that time, you need

40:52.3

training data, and then you need processing power. And he argued with the cloud that basically it all

40:57.3

came down to data. This is back in the day, because we had the algorithms, they're basically

41:00.8

open source. We have the cloud computing. And so it's really all about training data. I think he

41:06.1

went on to say he thought China would surpass the US for that reason, because it has no privacy

41:11.1

controls. But all of that is to say, where do you see it? As tweaks in them? Is there

41:18.1

more juice to squeeze out of these LLM models, do you think, or is it really more about, like,

41:23.6

a data science thing where it's all about what you put in and trying to optimize that i'm trying to

41:28.5

pick between the two i'm very confident it's both um mainly because if you look at the open model

41:33.5

community over the past, like, since February, people just keep on coming up with new

41:39.3

little tricks that make the models run faster and smaller like the fact that i can run a gpt 3.5

41:46.3

class model on my laptop now and i certainly couldn't do that a year ago because like the

41:52.0

models that were coming out and like the first versions of llama and stuff were much larger

41:56.7

required much more hardware much less optimized um so there are so many techniques that can be

42:02.1

used to make these things, and I'd like smaller and faster, right? I want a model that works

42:07.1

on my phone and can do the things that I need to do. I want it to be able to summarize and extract

42:11.2

facts and call functions and all of that kind of stuff um but at the same time people keep on

42:16.9

finding that the higher quality the data, the better. There really is so much to be said,

42:21.4

especially when you're fine-tuning these models for just having super super high quality data

42:25.7

that you feed into them um if the new york times thing plays out one way we may find that it's no

42:32.1

longer possible to just steal the entire internet and train your models on it at which point

42:36.3

that raises some really interesting questions. The thing that worries me most about

42:40.5

that is does that mean that llms then become incredibly expensive to build because of the

42:45.7

licensing costs to the point that you don't give them away for free and so does that mean that only

42:50.0

people who are very wealthy can afford to use these tools whereas today anyone who can afford

42:56.8

an internet connection has access to some of the the best in class of these models so that really

43:02.2

scares me like the the that that i feel despite the fact that the um the the ethics around copyright

43:07.9

i mean there are very very real concerns here but at the same time a world in which only the

43:13.1

the most wealthy have access to the to these tools that feels unfair to me as well yes and we can't

43:18.8

lock these tools up they are super useful like to take them away would be foolish like also if you

43:24.7

banned them, I've got a USB stick with half a dozen models, and you create a black market of people.

43:30.1

it's very cyberpunk right people swapping usb sticks with like with the last version of

43:35.8

mistral that was released on them super so there was a paper a little while ago just

43:41.9

pick up what you said there about open ai saying we haven't got a moat or something like that um

43:47.9

there was a leaked memo from google it was somebody within google put this memo together

43:52.2

saying saying there is no moat for this technology um it's interesting to revisit that that i think

43:58.6

that was it was quite it came out in maybe march or april of last year and it's interesting to look

44:03.4

back at that now and say okay how much of this played out because uh one of the real challenges

44:07.7

with this stuff is um if it's all just driven by human language prompts the cost of switching to

44:14.2

another language model might be as simple as saying okay we'll run this against claude instead

44:17.8

of gpt4 and maybe that will give you the exact same effect right um or maybe it won't because

44:23.7

so much of the the prompting comes down to these very small tweaks that you make where you're like

44:27.8

oh okay if i capitalize the instructions to output in markdown maybe it'll actually listen to me this

44:33.0

time but that effect itself is kind of hurt by the fact that openai upgrade their own models so

44:38.3

just because that works now, will it still work in a few months' time? It's kind of

44:43.1

uncertain. So that's part of it. There's also the fact that the closed model providers are up

44:50.4

against tens of thousands of researchers around the world collaborating together. That's something

44:54.6

I really like about the open model community is there's all of this sharing and this acceleration

45:00.0

that comes from just having tens of thousands of people worldwide all trying to solve these

45:04.8

problems. And OpenAI are an incredibly talented, experienced set of people, but I still don't like

45:11.3

their chances against tens of thousands of people around the world although of course when those

45:15.6

people around the world figure something new out cool open ai can just take that research and use

45:19.5

it themselves so so you can they can sort of keep up that way um but yeah it's and there's also

45:25.7

there's the compute right like it's we still don't know why gpt4 is so much better than everything

45:31.4

else um the most likely thing is that they ran it they trained it for longer and they trained it on

45:37.9

more data than anyone else has been able to do yet but still people are catching up now that there

45:44.0

is, if you have a hundred million dollars, maybe it's worth funneling that into data

45:49.8

and training you know that it's not like there's a shortage of investor money floating around the

45:54.5

space at this point yeah i guess and the economics are so crazy because yeah it's a hundred million

45:59.8

dollars but then then it's just a file that you know anyone you can sell to anyone for virtually

46:05.6

nothing right when people people often complain about the environmental impact of language models

46:13.1

where they say well look training like training a language model takes this enormous amount of

46:18.4

carbon dioxide which is true at the same time it's about the same amount of carbon dioxide as flying

46:23.8

a boeing 747 across the atlantic twice you know which is a vast sum but i would argue it benefits

46:30.8

more people because your airline flight benefits the people on that plane the language model if

46:35.6

it's then used by a few million people over the course of six months it feels like you are getting

46:40.4

more value for your for your for your sort of carbon dioxide at that point yeah i have a

46:45.3

question about the um carbon the co2 usage so i my understanding of ml which is machine learning

46:51.3

which is quite limited but it was that the training was the hard bit but then once you what

46:55.3

you get out of um the training algorithm is a kind of vector operation which you can run almost

47:01.6

you know quite cheaply and then i saw though people complaining um and i didn't have the time

47:08.0

to follow up, but that every time you generate an image with DALL-E or whatever, that uses so

47:14.3

much water or so much this because it's still computationally expensive to run the model not

47:20.0

just train the model is that true so this is an interesting question so like i said i run models

47:24.5

on my iphone i run models on my laptop i am not worried about their resource constraints um but

47:30.2

again i don't know what gpt4 is running on i'm pretty sure it's running on a full server rack of

47:35.6

GPUs. So my hunch is that for the very large models, yeah, there's a lot of cost in the inference.

47:41.8

i still think it's a fraction of what it costs to train them that's that's the intuition i've

47:46.3

i've gotten from this um and you know like the image like stable diffusion also runs on my phone

47:52.1

so there are versions of these models where the environmental impact of running them is no worse

47:57.8

than turning your laptop on that's but but i don't really have good insight into what the large

48:03.5

hosted models are doing so i had so we've talked all about llms i wanted to ask about your tool

48:08.6

because if i want to run this you've got the perfect tool for me to download and do so please

48:12.8

tell us about that because we we've talked about all the exciting things okay okay if i actually

48:17.4

want to do it what do i have to do so i built this tool in python called llm i got lucky llm was

48:24.5

still available on the package index. So you can pipx install llm, and you get a command line tool

48:30.5

for interacting with models. But what's really fun about it is that it's inspired by Datasette.

48:36.6

It's all based around plugins. So out of the box, you can give an OpenAI API key, and it will run

48:41.8

against OpenAI. And then there's about a dozen plugins you can install that will add additional

48:46.2

models, including models that run on your own machine. So you can essentially pip install

48:50.9

my tool, and then pip install a plugin

48:53.9

that adds a language model to it.

48:55.3

And now you've got a four gigabyte file

48:57.3

on your computer that you can start interacting with.
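
[Editor's sketch: the file sizes mentioned in this conversation ("16 gigabyte blobs of floating point numbers", "a four gigabyte file") follow from simple arithmetic on parameter count and bits per weight. This is a rough illustration only. The 7 billion figure is borrowed from the name "Mistral 7B" mentioned earlier, and the bit widths are assumptions, since the transcript does not say which model or precision a given file uses. Real files also carry some overhead beyond the raw weights.]

```python
def model_file_gigabytes(parameters, bits_per_weight):
    """Approximate size of the weights alone, in gigabytes (10**9 bytes)."""
    return parameters * bits_per_weight / 8 / 1e9


seven_billion = 7_000_000_000  # assumed parameter count, as in "7B"

for label, bits in [
    ("32-bit floats", 32),
    ("16-bit floats", 16),
    ("4-bit quantized", 4),
]:
    size = model_file_gigabytes(seven_billion, bits)
    print(f"{label}: {size:.1f} GB")
```

Under these assumptions a 7-billion-parameter model is roughly 14 GB at 16 bits per weight and roughly 3.5 GB at 4 bits per weight, which is the right ballpark for the sizes quoted here.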

49:00.1

But crucially, the interface is the same

49:02.0

no matter what model you're using.

49:03.5

So it's LLM space, double quotes,

49:05.8

your prompt. Or you can pipe things into it as well: you can do cat myfile.txt | llm.

49:12.1

And by default it'll use your default model. If you stick -m claude on the end

49:17.9

and you've got the Claude plugin, it'll run it against Claude, and so forth. And of course

49:22.0

everything it does is logged to SQLite, because I do everything with SQLite. So one of the great

49:28.1

things about using this tool is that it's a way of building a sort of database of all of your

49:33.0

experiments across all of the different models so i just use it on a like daily basis for all

49:38.4

sorts of different bits and pieces and i've accumulated like a few thousand prompts and

49:42.9

responses in my sqlite database of things that i've tried out maybe at some point i'll do some

49:47.0

analysis on that and try and start comparing models that way but really the fun thing about

49:51.3

it is, I'm trying to make it so whenever there's an interesting new model, you can install a plugin

49:57.2

and start playing with that model and that works for hosted models and it works for local models

50:01.4

as well um and yeah it's it's really really fun to hack on one of the things i've realized from

50:08.0

playing with it one of the original ideas is um the unix philosophy the unix command line of piping

50:13.3

things to other things is an amazingly good fit for language models because the language models

50:17.7

it's a function you you pipe it a prompt and it gives you a response and so one of the things that

50:23.2

i use my tool for is um it's um it it ties into this concept of system prompts which is something

50:29.4

that open ai did originally and other models have started picking up where you've sort of got a

50:33.8

second prompt that gives you instructions about what to do with your other data so a great example

50:39.3

is I can take a file and I can say cat myfile.py | llm --system, write me

50:48.5

some unit tests and then the model gets the prompt write some unit tests and it gets a bunch of

50:53.4

Python code, and it'll spit out a bunch of unit tests. And of course they won't be exactly what

50:57.4

you need but it's that skeleton that you can start hacking on it's really good at explaining

51:01.9

code i pipe it code and say explain what this thing does um i use it for uh release notes not

51:08.0

to publish. I kind of feel like it's rude to just straight up publish something that an LLM wrote for

51:13.8

you because i mean what are you doing right like it's it's fine to take as long as it's fine to

51:20.3

publish something which you're willing to sign your name to because you at the very least reviewed

51:24.6

it extensively and hopefully revised it and tidied it up but there are lots of projects out there

51:29.3

that don't bother writing good release notes and what you can do is you can check out their git

51:34.2

repository and you can do git diff between this version and this version, pipe, llm --system,

51:41.3

write release notes. And GPT-4 can understand a diff format; it'll read it and it'll spit out

51:47.7

release notes which, in my experience, are about 90% correct and 10% slightly wrong, or maybe there's

51:54.0

hallucination in there and that's fine right that's good enough for my purposes just saying

51:57.5

okay what have they done in this release that they didn't bother writing release notes for
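
[Editor's sketch: the pattern described above (pipe some text in, attach a system prompt that says what to do with it, and log every exchange to SQLite) can be outlined in a few lines of Python. This is a minimal sketch under stated assumptions and not the code of the llm tool itself. `fake_model`, `run`, and the `log` table layout are hypothetical names invented for this illustration, and the stand-in model only echoes what it received so the example runs without any API key.]

```python
import sqlite3


def fake_model(system, prompt):
    """Hypothetical stand-in for a real model call; it only reports its input."""
    return f"[{system}] received {len(prompt.splitlines())} lines of input"


def run(system, prompt, db, model=fake_model):
    """Send a system prompt plus piped-in text to a model and log the exchange."""
    response = model(system, prompt)
    # Illustrative schema only; a real tool would likely record more detail.
    db.execute(
        "CREATE TABLE IF NOT EXISTS log (system TEXT, prompt TEXT, response TEXT)"
    )
    db.execute("INSERT INTO log VALUES (?, ?, ?)", (system, prompt, response))
    db.commit()
    return response


# In a real command line tool the prompt text would come from sys.stdin.read();
# a literal string stands in for the piped file here.
db = sqlite3.connect(":memory:")
source = "def add(a, b):\n    return a + b\n"

print(run("write me some unit tests", source, db))
print(db.execute("SELECT COUNT(*) FROM log").fetchone()[0])
```

Because the model is just a function from (system prompt, input text) to a response, swapping in a different model means passing a different `model` argument, and the log table accumulates a record of every experiment regardless of which one was used.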

52:01.0

so yeah i i i recommend trying this thing out partly because it's fun to play with models

52:06.5

and something i'll say about the models you can run on your own laptop is they are kind of crap

52:11.6

like they are they are very very weak compared to gpt4 but that's a feature because it's easier

52:18.0

to build a mental model of how they work when you work with the weak ones like gpt4 because it's so

52:23.5

good you can use it for a few days without really seeing the weaknesses and the flaws in it because

52:28.8

it gets most things right but it's still just you know guessing what word should come next it's

52:34.1

doing the same kind of thing the little ones will hallucinate wildly which is so useful for getting

52:39.4

a feeling for okay these things are not intelligences these things are dumb autocomplete

52:45.1

that's just been scaled up to be able to cope with lots of things um i love i use myself as a test

52:50.2

thing because i've been around on the internet for long enough that these things can answer

52:53.7

questions about me like i can ask for a bio and some of the models will get most of the details

52:59.7

right they might say i went to a different university or whatever and some of them will

53:02.9

just hallucinate wildly and so i've had models tell me that i i co-founded github and things

53:08.3

like that um and it's amusing but it's also quite good as a sort of like just an initial sniff test

53:15.0

to see, okay, how good is this model

53:18.4

when it comes to hallucination and that kind of thing?

53:21.0

Okay, super.

53:21.9

We're coming up on time a little bit.

53:23.2

I wanted to add one positive note,

53:25.2

which I've heard about, you know,

53:26.7

we mentioned that these tools could further

53:29.0

increase the economic divide,

53:32.0

but they are democratizing a lot of things,

53:33.9

like in unexpected ways, at least to me.

53:35.6

Like, for example, someone I know

53:37.2

is an admissions director at UC Berkeley,

53:40.1

and a friend asked that person,

53:42.4

hey, you know, what is it now? What is it like with these college essays now that

53:47.1

ChatGPT exists? And he said, it's actually great because it's an equalizer because rich kids have

53:53.5

had private essay tutors for forever. And now everyone has, you know, 80%, 90% of it. I mean,

54:00.9

it probably makes them all sound kind of the same anyways, but it's a tool that people who don't

54:04.9

have these external resources, if they know how to use it, can, you know, up the, you know,

54:08.8

It's just like Grammarly and all these tools to help increase the writing.

54:12.3

And so I was pleased with that because I think it's very easy to get a little

54:15.9

doom and gloomy about it.

54:17.4

But it is for almost no money bringing these resources to so many people who didn't have

54:23.2

them before.

54:23.8

Oh, I couldn't agree more.

54:25.0

I feel like we always get very hung up on the many ethical flaws of this technology

54:29.8

and the harmful ways it can be used.

54:31.2

The positive ways they can be used are just enormous.

54:33.9

Like the reason I'm spending so much time with this tech is that I do believe that it's

54:38.1

genuinely useful and it does genuinely provide enormous amounts of value to enormous numbers

54:42.9

of people um if you have english as a second language this tool is phenomenal right you can

54:48.8

now you're no longer cut out of those things in your life those parts of society where you need

54:54.6

to be able to write like somebody who's a native speaker who's at a certain level of education

55:00.3

and that has been completely flattened. um, like, people sometimes say, oh, it's not worth

55:06.4

learning to program anymore because ChatGPT will just do it all. i think that's complete

55:11.0

rubbish i think now is the best time it's ever been to program because anyone who's coached

55:16.6

somebody learning to program has seen that the first six months are just utterly horrific like

55:20.7

it's it's so frustrating because you try something and you get this obscure error message that

55:25.0

doesn't make sense to you and you can bang your head against it for two hours and maybe you give

55:29.1

up lots of people do give up they assume that they're not smart enough to learn to program

55:32.7

And it wasn't that they weren't smart enough. It's that they weren't patient enough. Nobody warned them how tedious and stupid this stuff is. And now we can give them a tool. We can say, look, if you get an error message, paste it into ChatGPT, and nine times out of 10, it will tell you what to do next and how to get out of that condition. That's phenomenal, right?

55:50.8

the the flattening of that learning curve getting more people my my ideal end point of all of this

55:57.1

is i think every human being deserves the right to have computers automate stuff for them like

56:03.5

i can do this right i've got 20 years of programming experience if there's anything

56:07.8

tedious in my life that the computer can automate, i can get it to automate that thing

56:11.9

but it's ridiculous that you need 20 years of experience to do that like that should be a

56:15.8

universal human ability, and I think this technology might get us there.

56:20.9

I feel like if we get to a point where people are able to get those tedious automated things just done for them

56:27.8

because they didn't have to learn to program first, that feels enormously valuable to me.

56:32.6

Yeah, absolutely. I'm just lighting up as you spell out that scenario.

56:37.7

One of the parallels to that is why hasn't technology pervaded more deeply throughout the clerical world, for instance?

56:45.8

It's like people are still doing with paper or sort of manually repeating a task on a computer.

56:51.3

Yes, it's in a spreadsheet, but it's not automated.

56:53.4

Why not? Because they never picked up those programming skills.

56:56.2

But all of a sudden, if these assistants are built into Excel or built into Word or built into the software they're using, it can be automated easily.

57:08.1

I heard a horrifying story the other day about a local fire chief, like the guy running a fire department who, due to some mess up, had to manually unsubscribe 2,000 people from a mailing list.

57:22.0

And he spent a full day clicking the unsubscribe button over and over again in some horrible piece...

57:26.4

This is somebody who has a very real, very important job to do.

57:30.6

And I think this pattern plays out a lot.

57:31.8

A lot of people with a lot of important things in life

57:34.5

end up stuck for a day doing something tedious and manual

57:38.0

because we haven't given them the tooling

57:40.7

that lets them not have to do that.

57:44.3

So yeah, so I'm really excited about that.

57:46.7

I think as an educational assistant, it's amazing.

57:49.2

I think one thing that isn't necessarily talked about enough

57:51.9

is these things are actually very difficult to use effectively.

57:55.3

And they feel like they should be easy

57:58.0

because it's just a chat bot.

57:59.2

But actually, to really get the best out of them, you have to understand the prompting techniques.

58:03.2

You have to know what it can do, what it can't do, what are the things that it's going to break on.

58:06.8

I love that we've created computers that are bad at maths and can't look up facts for you, which are the two things that computers have always been best at.

58:16.4

So people sit down and check, well, it got maths wrong and I asked it for a fact and it couldn't tell me the answer because that's not what it's for.

58:24.3

But that's really not obvious, you know?

58:27.0

But it can now, right?

58:28.1

Isn't that the whole thing with Sam Altman?

58:30.6

One of the things is it can do math now, allegedly, the new version?

58:34.9

I mean, well, it can if you give it tools.

58:37.5

So ChatGPT, the paid version now has access to Bing search, so it can look up facts, and it has access to Code Interpreter, so it can run mathematics using Python, which on the one hand, it does fill those giant gaps.

58:50.8

On the other hand, it makes it even harder to use because now you have to know what Bing search is.

58:54.5

You have to understand bits of Python.

58:56.1

You have to know this is the kind of thing where it's got vision support so it can read documents.

59:02.8

But the interaction of all of these features is incredibly complicated.

59:06.2

A great example is sometimes I will give it like a photograph of a receipt and ask it to add up the numbers in the receipt.

59:13.9

And it will then write Python code that imports Tesseract and uses Tesseract OCR to pull out the numbers and then it will try and add them up.

59:21.6

But of course, Tesseract isn't as good as GPT Vision, right?

59:25.3

If it had taken that image and used its built-in OCR to pull the numbers out

59:31.2

and then passed to Python, I'd have got a more reliable result.
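
The second half of that pipeline, handing already-extracted text to Python for the arithmetic, might look roughly like this. It is a sketch under assumptions: the receipt text, the helper name `sum_receipt_amounts`, and the rule that prices have exactly two decimal places are all made up for illustration, and the text extraction step itself is not shown.

```python
import re
from decimal import Decimal

# Amounts such as 4.50 or 1,250.00 (two decimal places assumed).
AMOUNT_RE = re.compile(r"\d+(?:,\d{3})*\.\d{2}")


def sum_receipt_amounts(text):
    """Add up every price found in the text, ignoring total lines."""
    total = Decimal("0")
    for line in text.splitlines():
        # Skip lines that already contain a (sub)total, to avoid double counting.
        if line.strip().lower().startswith(("total", "subtotal")):
            continue
        for amount in AMOUNT_RE.findall(line):
            total += Decimal(amount.replace(",", ""))
    return total


# Hypothetical text, standing in for whatever the OCR step returned.
receipt_text = """\
Coffee      4.50
Bagel       3.25
Juice       2.75
Total      10.50
"""
print(sum_receipt_amounts(receipt_text))
```

`Decimal` is used instead of `float` so that currency amounts add up exactly. The arithmetic is the reliable part; the result is only as good as the text the extraction step hands over.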

59:34.8

How the heck am I supposed to explain that to anyone?

59:37.5

Like, I have to know what Tesseract is.

59:41.3

Kind of.

59:42.0

But you see what I mean?

59:42.9

As they add more features, the matrix of complexity of how the features interact

59:46.9

gets even more complicated.

59:49.0

So having expert skills to use this stuff gets harder.

59:52.9

But I think you said this in a recent interview.

59:55.3

I mean, using a chatbot is one of the worst user interfaces.

59:58.5

It's like terminal, right?

59:59.5

We're not, like, no one's using terminal, right?

60:01.6

So it's, do you have any, and I guess the final question for me,

60:04.2

do you have any predictions on, you know,

60:06.2

the mouse equivalent of where this stuff goes?

60:08.8

Because we're not all going to be using chatbots forever, I don't think.

60:11.2

I certainly hope not.

60:12.2

Yeah, like, I mean, yeah, like you said,

60:14.3

the problem with chatbots is there's no discoverability.

60:16.8

Like, you've just got a blank box to start on.

60:19.4

And they might give you a couple of suggestions,

60:21.6

but it's a terrible user interface.

60:24.9

But that's the thing that's really exciting about the space right now is there is so much low hanging fruit.

60:30.6

Like you could sit down and just come up with an alternative UI for interacting with language models.

60:36.6

And right now, maybe you'll invent the thing that everyone will be using for the next six years because we're so early in this process.

60:43.6

There is so much scope for innovation around how we use these, how we interact with them.

60:48.3

And that I find really exciting.

60:49.6

I love that now that people are beginning to understand this tech and what it

60:52.7

can do, like we need designers on this stuff.

60:55.9

We need user experience people.

60:57.7

We need, but it turns out machine learning nerds are the worst possible people

61:02.2

to actually make use of this technology.

61:04.6

Because they're thinking in terms of, you know, they're thinking in terms of,

61:07.8

okay, well, I've got to optimize my gradient descent or whatever.

61:11.2

You don't need to know what gradient descent is to innovate on top of language

61:14.1

models.

61:14.5

That's almost a distraction from what we're trying to,

61:16.8

what we can achieve with them.

61:18.7

Brilliant.

61:19.2

Brilliant.

61:19.6

Okay. So we are going over. I did want to ask you about Datasette, but we've kind of run short. So

61:24.8

can you give us the 30-second, what's new in Datasette? We've talked to you about it before,

61:29.3

but what's hot? The most exciting new feature in Datasette, I've been building this feature

61:33.7

called enrichments, where the idea is that you've got, say, a CSV file with 10,000 addresses in,

61:39.9

and you load that into Datasette and you want to see them on a map. So you need to geocode those

61:43.9

addresses. With enrichments, you can have a plugin that lets you select the address column and say,

61:49.0

geocode this and it'll go and churn away against the geocoder of your choice and it'll populate

61:53.4

latitude and longitude columns next to it but crucially these things are all built as plugins

61:57.6

so you can have a plugin that does geocoding i've got a plugin that does just regular expression

62:03.0

extraction of things and i've got a gpt plugin so you can say take this take this database table

62:08.9

run this prompt against every single row and then put the output of that prompt in this other column

62:14.4

And there's one example of that. It can do the GPT vision thing. So I actually fed it a database table with 100 URLs to images and told it to write me descriptions of those images. And I got back three or four paragraphs per image describing what was in the image right there in my table.

62:30.4

and of course now i can search against that and do all of that sort of stuff so i'm really excited

62:35.6

about that. it means that Datasette is evolving into more of a data cleanup and manipulation tool

62:41.6

which is a departure. originally it was about publishing and exploring data um but i realized that

62:47.4

the problem i most want to solve especially around journalism is if somebody gives you a hundred

62:51.6

thousand rows of data what the heck are you supposed to do with that right especially if

62:56.3

It's slightly too big to put in Microsoft Excel,

62:58.7

but you can't afford to hire some programmers

63:00.8

to build you like a custom Django Postgres app

63:03.1

for this thing.

63:04.0

What do you do?

63:05.0

And if I can build plugin-based tools,

63:07.0

especially with Datasette Cloud now,

63:08.6

so I can host them for people,

63:10.2

where you can upload your CSV file,

63:12.4

click, click geocode,

63:13.8

wait a couple of minutes as the progress bar fills in.

63:16.2

Now it's all geocoded.

63:17.3

Now you can visualize it on a map.

63:18.6

That's really exciting.

63:19.9

And yeah, and the enrichments,

63:21.2

I tried to make it as easy as possible

63:22.8

to write additional enrichments as plugins.

63:25.3

So I'm hoping to see people building their own enrichments

63:28.3

for all sorts of other data transformations they might want to pull off.
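
The general shape of an enrichment, running a function against every row and writing the result into another column, can be sketched with nothing but the standard library. This is not Datasette's plugin API: the `enrich_column` helper, the `places` table, and the postcode pattern are all assumptions made for illustration, and the table and column names are interpolated into SQL, so they are assumed to come from trusted code.

```python
import re
import sqlite3


def enrich_column(conn, table, source, target, transform):
    """Apply transform to each row's source value and store it in target."""
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    if target not in columns:
        # Create the output column the first time the enrichment runs.
        conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{target}" TEXT')
    rows = conn.execute(f'SELECT rowid, "{source}" FROM "{table}"').fetchall()
    for rowid, value in rows:
        conn.execute(
            f'UPDATE "{table}" SET "{target}" = ? WHERE rowid = ?',
            (transform(value), rowid),
        )
    conn.commit()
    return len(rows)


def extract_zip(address):
    """Regular-expression extraction: the first five-digit run, or None."""
    match = re.search(r"\b\d{5}\b", address or "")
    return match.group(0) if match else None


# Hypothetical data, standing in for an uploaded CSV file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (address TEXT)")
conn.executemany(
    "INSERT INTO places VALUES (?)",
    [("1 Main St, Springfield 12345",), ("No postcode here",)],
)
enrich_column(conn, "places", "address", "zip", extract_zip)
print(conn.execute("SELECT address, zip FROM places").fetchall())
```

Swapping `extract_zip` for a function that calls a geocoder or sends a prompt to a language model would give the other enrichments described above, with the row-by-row loop left unchanged.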

63:32.1

I have to ask just one more question then.

63:34.0

If you're doing it on Datasette Cloud and I've got an enrichment

63:38.3

and I'm going to give you some software,

63:40.2

did you find a solution to how you can run my software in a sort of trusted way?

63:45.1

Well, at the moment, I can review your software

63:47.7

and make sure it doesn't have any fatal holes in it.

63:50.1

And then I've actually – Datasette Cloud has a feature now

63:52.2

where I can basically say to this customer,

63:54.5

pip install this additional package.

63:56.9

So I do have that now.

63:58.2

And also, Datasette Cloud, I built it on top of Fly.io

64:01.4

precisely because they offer secure containers.

64:04.5

So with Datasette Cloud, every customer gets a separate container.

64:07.6

So if you somehow manage to screw up the security in your container,

64:12.2

it's isolated.

64:13.3

That's a problem for you.

64:14.3

It's not a problem for other customers in the system,

64:16.0

which felt really important to me.

64:18.0

Yeah, okay, good, good.

64:19.4

Because I know you've been noodling on that problem for quite a long time.

64:23.2

Yeah, I still want to be able to run WebAssembly server-side reliably for untrusted code.

64:30.2

That's like my ultimate goal.

64:32.7

Because, yeah, I want users to be able to say, here is some Python code, run this against all of my data to transform it without risk of them breaking things or whatever.

64:41.5

And it feels like we're almost there with WebAssembly.

64:44.9

And that would be amazing.

64:45.8

If I can take untrusted Python code and run it in a WebAssembly sandbox that's locked

64:50.3

down and that can't do network access and can't reach the file system, that would be

64:53.5

amazing.

64:54.6

Okay, super.

64:55.3

We could go for another hour, but thank you for taking the time to talk about all these

64:59.7

things.

64:59.9

We're going to have links to everything and, you know, Datasette, Datasette Cloud, Enrichments.

65:04.4

Those are, I think, the three big things that fans of yours should go take a closer look at

65:08.3

if they're not already familiar.

65:10.1

Cool.

65:10.5

Yeah, this has been really fun.

65:11.7

Yeah, I'll put together some links to this as well.

65:15.5

Okay. Thanks, Simon. Thanks so much for coming. That was really awesome and illuminating and

65:19.6

filled in so many questions around a really hot topic. So super.

65:22.7

Thank you everyone for listening. We're at DjangoChat.com and we'll see you next time. Bye-bye.

65:27.0

Bye-bye.