Transcript: JWTs and AI - David Sanders

Hello and welcome to another episode of Django Chat, a weekly podcast on the Django Web Framework.

I'm Will Vincent, joined by Carlton Gibson. Hi, Carlton.

Hello, Will.

And this week we are joined by David Sanders. Hi, David.

Hey. Hey, guys. Thanks for having me.

Hello, David. Thanks for coming on.

So you maintain the current JWT package, which is almost mandatory for use with Django REST

framework. So we want to ask you about that and all your other work. But first, what's your

backstory? How'd you get into programming and specifically Python Django? Let's see. Yeah,

I was trying to think about how I would come about this question as I was showering in the morning.

Let's see. I think I probably got into programming for similar reasons to a lot of other sort of

nerdy kids growing up in the 90s um and you know i was like i like video games and things like that

and uh but yeah it also just so happened that i went to a high school which had some

classes for that which i think was not super common in the 90s um i mean i mean i grew up

in the 90s and there was zero programming classes even in a college town right right i mean so it's

pretty early on and so you know this is when computers were just starting to become sort of

household fixtures and uh and the internet was we had email we had email i remember in my high

school we were very proud of that like yeah right late 90s but it sort of stopped at email

no programming classes. That's good, that gives you an idea. That's like a good litmus test for

what we're talking about here, because email had just shown up, you know, for the

average person, in probably its earliest form. And yeah, I remember my middle school actually had

email or like it allowed you to sign up for a unix account and this was in like 93 or something

and that was in boulder by the way where i'm living now yeah well well there there's probably

some similarities because i'm from basically dartmouth um and so dartmouth had its own

homegrown email system way back in the day yeah and i think the high school system yeah they and

then the and they you know created the term ai at some conference in the 50s um but they the

high school's email system was based on dartmouth which was called blitz i believe so i think you

could you know you could aol you could like you know cds were flying around you could connect

that way but to have uh you know high school run email system in the mid 90s was unusual but

probably not atypical for high schools in you know university towns where you had that was some

degree of knowledge that was it because um i remember boulder valley school district had a

had this unix server which um you know at the time i was just like okay so we have that but i guess

later i found out that the server was actually at cu or the university of colorado at boulder

so it was actually located there and i assume it was part of you know uh the technology departments

or the engineering school or something so yeah so so that was lucky and i think that was a big

reason why there were programming classes and why I was able to, you know, pursue that interest

um early on that's interesting as well that you had unix available because when i was you know

i was in the 90s it was for me it was all windows and you know that was great and it was all fun but

i didn't discover unix till i got to university 97 i was like what's this how do i how do i do

anything here yeah right and i didn't actually i started on a macintosh machine and this is like

way before oh wow we're like even really a thing i mean they were but they were definitely kind of

less common well i did i mean i did too if i think back because like in the in the 80s mac

or apple made a big push to get in school systems so the old what mac twos that was what we had

actually that's true i remember that yeah so they they were like you know they were in all the

schools and then they went away and then they came back as being super high price but for a while it

was i mean because i remember my middle school we had a computer class where you learned how to type

and someone uh told me if if you just took the test in the beginning that was an option so i

did that. So I remember just playing Oregon Trail and, like, The Manhole for a semester, which was pretty

fun. Yeah, that's a good point, because I do remember that I always saw Macs in schools, and you know,

we didn't have an Apple IIe, we had a Macintosh, so we actually had a color monitor, which was like

13 inches or something. Yeah, well, I was reminded of that time

because I just, um, the game Myst, I played, yeah, I played around that, and yeah, it was

middle school for me so 93 94 yeah that was me too yeah yeah sorry carlton um no no no i'm loving

it i'm just thinking old man talk about computing yeah no but i was gonna say that early days of

technology is really fond memories. Yeah, well, that game is so good, and you can get it on

the iPad now, and since I've been quarantined with a bunch of kids and I'd totally forgotten the game,

so i've been so i downloaded it and i've been playing and it's still so good and then my

year old daughter i was feeling kind of good that i remembered a couple things but it's yeah think

about it she's just like touching stuff she zipped through like more than i did she was like three

times as good as i am yeah so it's sort of humbling this is such an interesting like uh road marker

because i remember the time it came out it just blew my mind how amazing it looked and of course

it was just pre-rendered videos essentially and and it was yeah it was basically written in

PowerPoint, you know, it was this thing called HyperCard, and so, you know, if PowerPoint had

a scripting language which maybe it does but i don't think that's really a selling point of it

that's that's how you could orient yourself in this conversation like this video game was like

very popular and was written in powerpoint essentially and just embedded videos and

powerpoint slides um well the update is um it's pretty fun on the ipad because they've they've

redone it so you can it's not you can sort of move around a little bit make it 3d and the graphics

are slightly improved but sorry last thing on this because i've been um so there's another

game, Riven, which came after, yeah, in high school, 97, 98, and I haven't gotten that yet on the

ipad but i'm going to but i remember playing that there's like four discs on it and you have to go

between worlds and so every time i had to move i had to like swap discs and besides the fact it

was like way harder and i had to cheat to finish it i remember just endlessly swapping discs which

was super frustrating so i'm excited to not have to do that on the ipad coming up soon yeah anyways

total tangent sorry yeah no that's that's great though i i mean this this is in a way kind of the

story of how i got into programming like i just loved things like this when i was a kid probably

a little bit before a lot of other people took interest in these kind of things and um and then

I was lucky enough to have access to educational resources. And, and so it just kind of went from

there. And in school, I sort of did engineering. I kind of didn't, I was kind of half did it.

Um, but then of course, uh, yeah, sorry. Yeah. And university and college. And, um,

but then eventually, of course I ended up doing it for a living either way. So,

yeah when did python factor in right when did you just get into the python world yeah so let's see

um so i basically ended up getting a music degree uh that's my formal education so that's kind of an

interesting tidbit about me which is something i choose it now now i've decided to broadcast that

to the world it seems so um it's not something well musicians make musicians make some of the

best programmers. I mean, where I am in Boston there's the Berklee College of Music, and a lot

of the best programmers in Boston are former Berklee musicians who, you know, it's hard

to make a living but they are very used to dedicated practice yeah right right and so

it translates extraordinarily well to programming yeah maybe there's a sort of kind of abstract

ability to go deep on things somehow that becomes useful um anyway so let's see uh

where was i going with this i feel like we're going so you studied music in school

yeah, oh right, right. And so when I was coming out of that degree I just decided to

you know just get back into technology for obvious reasons because it's it's hard to make it as a

musician um and so some of the first jobs i was finding at the time were like php sort of web app

type stuff um and that was how i kind of reintroduced myself into the tech world and i

had had even a few sort of software jobs before finishing school um but that was sort of a

different era of software and so yeah as i sort of got back into the tech world i think that was

like 2008 or nine it was like right about when smartphones had happened and um you know social

media was starting to kick off a second sort of dot-com boom essentially and so there were lots

of kind of uh web app type there was a lot of web app work available so getting into php and that

transitioned into my next job uh doing python and django and it was all in the vein of like web

application work. And yeah, so that was when I joined a company called Fusionbox in Denver,

which i believe is still around um and they were a python shop and they had a number of really

talented python developers doing django development and that was where i got into it and that i think

that was about 2011 or so that's still relatively i mean early ish for django um yeah what would you

say, Carlton? I mean, I know it'd been out for a while, but it wasn't that popular. That's a sort

of similar time that I found it, and for similar sort of reasons, you know. 2006, doing PHP, MySQL,

and then, I don't know, 2007, found Python. And, you know, I went to a web conference, a

tech conference called Future of Web Apps, and everyone's talking about Django, Django, Django,

this thing that's been released. Went home, typed it in, it's like, the web framework for perfectionists

with a deadline right that's me yeah you know it wasn't just django it was python as well and it

was like ah brilliant so and then yeah writing writing web apps along with mobile apps and that

kind of same time exactly that same sort of timeline same time zone so about around the

django 1.0 1.1 versions that kind of thing yeah so that was right that was around that was in the 1.0

days and i can't remember what version i it was like 1.3 or something i think when i started

working on it anyway yeah that was how i got started well that's very early days i mean for

me i didn't really start programming till 2013 and then i was i was in san francisco at startups and

it was like rails or django and right i think django was the less popular choice and i'm sort

of a right um iconic uh what's the word not iconoclast um what's the word carlton johnny

come lately no no anyways i was like i'm not gonna do the popular thing um and i like the

look of python so i jumped on yeah um but it wasn't what was the popular thing then react

yeah i mean the company the company i was at was um php um yeah and then and everything was

rails i mean we were right we were right next to twitter square stripe um so yeah it was all rails

boot camps had just started and nobody was really doing um python or django right so i'm remembering

the react of the day was this thing called boost uh what was it um ember no it was it was a thing

written by Jeremy Ashkenas, he was the guy who did, oh yeah, yeah, yeah, Backbone, Backbone.

yeah so that was like an early form of like um javascript's uh spa framework you know

what had all the great helper functions that kind of made their way into javascript itself

Yeah, and did I say his name correctly? Jeremy Ashkenas. It's been so long since, I think, I think

so right because he's he's done a whole bunch of stuff too he's done some other i'm sure he's

gotten involved in a lot of other cool stuff yeah seemed like a very prolific technologist so

yeah those people so well so how did you um so do you i mean i guess it was a while back but do you

recall i want to get to um your jwt package like did you start working on apis with django was it

server side like what was the progression if you recall yeah so i think that was the reason um

because right when i first started using django i was doing a lot of uh what would you call it

um traditional sort of server-side rendered web apps and of course eventually the industry

transitioned more towards just apis interacting with uh you know front-end clients essentially

written in JavaScript. And so eventually I went to work for one of our clients from when I worked at

Fusionbox, which is Specialized Bicycles, which is kind of a funny, like, you know, place to find

yourself as a software developer, some might guess. Because, you know, Specialized, obviously, and I'm

talking about the, you know, if you know bicycle brands, it's the Specialized you're thinking of.

In the United States it's kind of a bike brand that's on the same

tier as Trek. And yeah, that's Specialized. So they had a number of kind of

data collection services essentially that they needed to be written in-house um and that's what

i was working on when i was there and those were supporting all these kind of like um they were

doing a lot of like body metric collection with like these little digital measurement devices

they were making and and so i was writing the apis that would receive data from those devices

and so this is like pre-strava was it gps or was it like power do they have like power um meters on

the um drivetrain um oh yeah so there was things like that um and actually it was really around

the same time as Strava. But our stuff was more just kind of like data collection. And so we had

a few devices that we made to sort of give to people who were running specialized shops. And

they could use it to like measure people's bodies to like suggest products to them and things like

that. So anyway, yeah, so I was writing a lot of APIs. And, like I said, Specialized had

been a client of Fusionbox, and so they had already been contracting with us to develop some

Django apps that we were writing. And so we had a few more kind of, like,

traditional apps that were server-side rendered and then we were starting to do some kind of

javascript front-endy stuff with them and so it was kind of a natural transition to just continue

using Django. And, yeah. So was that Django REST framework or was that Tastypie? What was the time

frame? It was REST framework. Because it was pretty early. Yeah, right, so it's fairly early

for REST framework. I mean, maybe Tom would correct us, but it seemed early, again,

from my perspective well look give give give you an idea the rest framework version 3 kickstarter

was in 2014 so there must have been 2000 i was using rest framework okay it's already on version

one, version two in like 2011, 2012, something like that. Yeah, and it was kind of competing with

Tastypie, and I do remember that it seemed a little more newish and maybe sort of unproven,

again tom might correct us but uh um but it did seem like i just sort of i got the impression that

that it just seemed a little more kind of clearly organized and things like that and

um and so i just we just made a call to use rest framework for the stuff we needed that kind of

thing for and then i just continued to use it you know at jobs after that so um or sorry we as in

Fusionbox guys decided to use REST framework. So yeah, we were using that for our Specialized work,

and eventually we just needed a way to sort of do

sort of single sign-on-y, like mobile client auth stuff.

And JWTs were, again, from my point of view,

seeming kind of like they were just showing up

or just starting to be used.

And this was probably, this was like, I don't know,

2013 or 14, around that time.

Maybe a little later, 16 or 17.

So, yeah, I wanted to ask: you created, like, the default right now, Simple JWT, but

Jose Padilla had a really popular package back then. I assume, I mean, his was still

you know more maintained um so do you recall you know why why create a a new one yeah and i was

trying to think about my reasons for this too as far as i remember so i was looking around for like

what was available to use,

and of course I ran across Jose's package.

And I remember thinking that it seemed possible

that it was maybe starting to be maintained less often.

That was the general pulse I got on it

when I was looking into it at the time.

And so I started poking around under the hood with the package

and it seemed to me like there were really just a few classes

he had written which were really kind of driving the package.

And I kind of thought, well, this stuff is,

there's just a little bit here

And so I think I will just take what I need from this and start to just make something.

It was almost like just sort of an academic exercise, too.

I was kind of like, how would I just sort of factor out the essential core of this and make my own version of it and do it in as few lines of code as possible?

And so I started doing that and it just kind of came together pretty quickly.

and so i just decided to go with my own sort of version of it um and of course eventually i got

in touch with him and i sort of let him know that i had done this and he was very encouraging because

he was i think at the time already starting to feel like it was sort of the package was going

into disrepair a little bit and that it was maybe useful for someone to come along and

just sort of take up the reins or kind of reinvent it or something like that and so

it seemed like the timing was good there because yeah i just i i had a need for that kind of thing

i happened to notice that it seemed like maybe there there could be another sort of a reworking

of it um so yeah that's how that's how that happened and yeah so we just started and now

you find yourself in the same position yeah so it's really funny because like i i did it just

because i i just had a habit of i think i had developed a habit in my career of poking around

under the hood of things and just seeing is is there just some like central core to how this

works that i could just some simple concept i could use and um and so i just happened to do

that with jose's thing and i also am just kind of a perfectionist somewhat and so i just thought

well if i were to make this something that people wanted to use then i might as well just kind of

like try to do it cleanly and sort of pretend like i'm making an open source package here that

people might want to use and yeah was it your first was it your first interesting open source

package that you've done for django anyways i mean definitely the first one that took off like

i had done a few little sort of things that were on my github there was sort of like a smorgasbord

of just sort of like half-baked things in my github account um but then i think i think the

real critical like difference was that yeah it was just a time in my career where i just

i just had enough experience to know how to put something together that's reasonably coherent

and i also had i happen to have enough time and a reason to do it and um i also yeah but not too

much experience to hesitate to do it right right yeah I didn't know what I was getting into and so

and I also reached out to Jose and and also Tom Christie because I had sort of borrowed components

of their project to make this thing and there were enough differences that it didn't feel like

just a fork that I could just or like a something I could make a PR for and so I felt obligated to

sort of notify them that i had done this and uh or in particular jose and i was he was very

receptive and positive about it and then also i reached out to tom and i said okay i think in like

i don't know whatever chat channel is linked from the rest framework website and i told him

that i had done this thing and uh i wondered if he wouldn't mind adding a link in the rest

framework documentation to it and that was i think when it really kind of took off and the

star count started just, like, exponentiating on GitHub. You know, so if you're linked from the docs

and then you know people are looking for it and you know i think it became clear after a while

that jose wasn't able to maintain um drf jwt and then you came along with the you know the perfect

replacement and it's like yeah that's super yeah and really my goal actually i think a particular

goal i had when i was making it was like i said as few lines of code as possible as transparently

implemented as possible so that if anybody came along was like oh this doesn't do what i want or

this has some issue or whatever then it shouldn't be hard for them to just like modify it to do what

they want and create their own fork. And that was a big goal for me, so I was striving for that. But

that makes it more maintainable as well because you're able to say to people hey look that's out

of scope for what i want to do but you know you could quite easily implement it like this off you

go. And yeah, right, right. I'll link to it in the notes: at DjangoCon 2018 I did a talk on auth

with Django REST framework, actually where I met Carlton, and I remember asking Jose, I was like,

hey, I'm doing this talk. Is your package still active? I've been using yours, but there's this

simple JWT one. And he was like, oh, no, no, no, no, use the other one, use yours.

Oh, wow, interesting. It's so funny. Man, there must be like a sort of constellation of

activity in the community I'm just completely unaware of, because I don't really make the

rounds at conferences very much. Well, that was my first one. I mean,

this podcast came about because, going to one, I was like, oh, there's people talking about Django,

like, I never get that in my normal life. So I wanted to share it. But he was very, I mean, we had Jose

on the podcast too there's a past episode but um yeah he was anyways just he was very much like oh

yeah like I use use David's so so I did so it's in there yeah well actually so another thing that

happened which i became aware of is that it seems somebody did a because i i can see sort of like

analytics for where activity comes from or like where people how people get to my github page

for the project um and there was a fair number of links that came from like a screencast somebody

did where they were like this is how you do jwt with django like use this package oh

and I wonder I wonder which one that was it was I yeah I'm not sure if it was like a particularly

well-known screencaster but um but it definitely seemed like there were some clicks coming from

there so well well I mean YouTube has crazy numbers for online tutorials things and I've

learned in particular that anything Django rest framework for tutorials or for Django con talks

um there aren't many of them uh so they they get a lot of clicks uh like for example that talk from

2018 is the has the most views of any talk from 2018 so i think part of me i was like thinking

like well i did a good job but then i did a different talk in 2019 and i'm just like bottom

of the barrel and the top talk is another one on Django REST framework. So it's very much like,

yeah, people just want to learn about that. Because I think also Djangonauts, a lot of them

don't know django rest framework or they don't like i i have a book on it and i kind of thought

that'd be my most popular book actually and it's by far my least popular book because and i think

it's because professional django developers basically only write back-end apis if you're

at an agency or a reasonably sized company but i think when you're learning or if you want to do

a smaller project generally you're just in the server side world because it's too much to do

Django plus, you know, whatever JavaScript framework of the week. So I've found that interesting. I've

tried to communicate that and i think i'm still not quite getting that across to people they just

don't if they're learning on their own they just don't understand why they would really use django

rest framework um or they don't have teammates to do it so anyways side note i i yeah i so i so

actually, so I initially was like, Django for Beginners, Django for APIs, and then this pros

book and it just turns out that apis is actually like the last one people read and it's really only

in the context of once they get their first team job that they do it because i mean because writing

an api is like what's the point for yourself unless and then and then you gotta go learn

javascript and yeah that's a lot for people i think you were hinting at it's probably a big

ask to really sort of expect that somebody just getting into that stuff would be able to

learn all the ins and outs and quirks of javascript and whatever you need to sort of

get up and running with um you know with that kind of stuff and uh and do that in addition to

learning about Python and, you know, server-side development and all that kind of thing. So yeah,

right well and specifically authentication i found you know once once i internalized what a

serializer does authentication is the part where and i think this is still the case from um from

people who contact me that's the part where people go what because you know you don't really

have to understand sessions and session um authentication with django it just kind of works

but then the official docs i think still list four there's four built-in options and then there's

just like a dozen third-party packages and i i don't think it's that much clearer to people

kind of what best to do i mean i think there is a sense of jwts are a good idea but then even

there's multiple ways to do it um so that's the confusion point for people once they get past

serializers is like i didn't know auth could be so hard yeah so maybe make the make maybe make

the case for so why use a jwt or how do you think about that do you um how do you think about that

clearly you wrote the package so you use them but if someone came to you and either cases where you

wouldn't or you wouldn't recommend using it so i remember when i was first considering using them

i think i basically had two needs for them uh like i said we were talking about doing single sign-on

and so i think i was finding information about jwts um because there's this idea that you could

just have the same verifying key across different services and so you only really need to sign on

with the service that has the signing key and so they can sort of function as a single sign-on

system in that way but the other need i had was that uh that we were gonna write an api for a

mobile client and i think we were working with some contractors from a different agency to do

this and those guys were just comfortable using jwts because i think they had gained enough of a

footing in that sort of area of tech that they were already being used for that kind of thing

So yeah, I think those were a couple of reasons I started to use them. I mean, as far as, like,

I still think it's interesting how you can just sort of avoid database hits to authenticate

requests, and that's one of the attractive things about JWTs, you know, because

you don't have to go query the database to find a session record. You can just verify

the signature and that's all you need, and that's a relatively inexpensive operation

compared with a trip to and from the database. So if you're really trying to go after

performance in the service, then I feel that that could be a consideration.
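
[Editor's sketch] To make the "verify the signature instead of querying the database" point concrete, here is a minimal, standard-library-only illustration of an HS256-signed token. It is not the code of Simple JWT or any other package discussed in this episode; the secret, the function names, and the `now` parameter are assumptions made up for this example, and a production service should rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical secret for illustration; a real service would load it from settings.
SECRET = b"example-signing-key"


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(payload: dict, secret: bytes = SECRET) -> str:
    """Build header.payload.signature using HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"


def verify_token(token: str, secret: bytes = SECRET, now=None) -> dict:
    """Return the payload if the signature and expiry check out, else raise ValueError.

    Only CPU work happens here: no database or cache is consulted.
    This sketch does not inspect the header's "alg" field; a real library must.
    """
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if payload.get("exp", float("inf")) <= current:
        raise ValueError("token expired")
    return payload
```

A server holding the same secret can call `verify_token(token)` on each request and trust the returned claims (for example a user id) without looking up a session row, which is the trade-off described above.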

And I feel like I've tried to sort of emphasize that with that package. I mean, I feel like there

are other ways that that tends to get broken. Like if you want to implement blacklists,

you know, token blacklists and things like that, then you just kind of have to do a database

query anyway. And so you've kind of thrown that out the window. Or I don't know, I mean,

maybe there are ways you could optimize that. But you know, yeah, if performance were an

issue you'd put a caching layer in and blah blah blah, all those kind of things. Right, yeah, you

can cache things and be smart about it, but you still have to sort of open up a socket to your

caching server. You can't just do it completely within the confines of the

Gunicorn process or something, you know. You have to do some I/O in order to serve that request.

But I guess you're doing that already anyway, you're doing I/O with the web server anyway,

you know what I'm saying? So yeah, there are some considerations along those lines.
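
[Editor's sketch] The blacklist trade-off can be illustrated as follows. This is a simplified sketch, not how any particular package implements blacklisting: an in-memory SQLite database stands in for the application's real database, and the table name, column name, and function names are hypothetical. It assumes the token's signature has already been verified and that tokens carry the standard `jti` (JWT ID) claim.

```python
import sqlite3

# In-memory SQLite stands in for the application's database (an assumption
# for this example); the table and column names are made up.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE blacklisted_token (jti TEXT PRIMARY KEY)")


def blacklist(jti: str) -> None:
    """Record a token id as revoked."""
    db.execute("INSERT OR IGNORE INTO blacklisted_token (jti) VALUES (?)", (jti,))


def is_blacklisted(jti: str) -> bool:
    """This lookup is the per-request I/O that pure signature checking avoids."""
    row = db.execute(
        "SELECT 1 FROM blacklisted_token WHERE jti = ?", (jti,)
    ).fetchone()
    return row is not None


def authenticate(payload: dict) -> bool:
    """Accept an already-verified payload only if its jti has not been revoked."""
    jti = payload.get("jti")
    return jti is not None and not is_blacklisted(jti)
```

Swapping the table for a cache would make the lookup cheaper, but as noted above it is still a round trip outside the worker process on every request.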

and so i've tried to sort of remind people of that because a lot of people come

in the issues of the jwt um package and they're like well why don't you just do blacklist by

default or this or that and kind of like well yeah like ideally you wouldn't have to do that

kind of thing if you don't want to and um yeah so that's one of the things i kind of harp on

in the issues section um as far as reasons not to use it i mean i think in general when you're

trying to make any decision about some technology to use it's just good to understand a little bit

about how it works internally and so in this case you know understand a bit about digital signing

and why you can authenticate a request without hitting a database just by checking the validity

of a signature and so when you understand a few of those basic bullet points about something you're

trying to use then you know exactly why you would use it you know and if you if your you know use

case doesn't match those criteria then just don't use it you know just use oauth or use just

database sessions or you know in other words don't use it just because it's cool and new and shiny

right so yeah i think the reason most people got into it particularly at that time i'm sure there

are other token-based systems available now but back in the you know a few years ago in the mobile

the mobile boom it was like i need a token-based system that's because it's much easier than trying

to get a cookie which i can then send along you know to try and do session auth with a mobile

you know from ios yeah you can do it but you're jumping through hoops whereas if you can just get

a token and attach it to the request bam i'm in the business multiple mobile clients multiple

platforms ah this is yeah this is exactly what we need brilliant yeah that's i think that's where

most people arrived at JWTs. That's how people think of it. Although, to be a little bit of a stickler,

in a in a funny way a cookie is a token you know like yeah well it's a random number which you use

to look up a session so so yeah i don't know it's but i don't want to be a jerk and like try it yeah

you know no no you should be i mean we're all opinions matter um i was going to say so a reader

was recently trying to correct me saying in my book i say well if you have a token you could

store it traditionally in local storage or in cookies and he was making the point of i think

he saw um oh what's his name's um there's um who had that there's a long piece on why local storage

is evil by i'm blanking on it anyways i'm curious on on thoughts around security for where one

should store a cookie or a token because that often comes up in the context of like don't you

know don't use local storage or don't use a cookie it's insecure and i see there's actually there was

a reddit thread on this just very recently and um there's a lot of misinformation out there so i'm

curious uh what both of you think about what advice you would give around storing it carlton

you're a former maintainer you're a django fellow you know i'm just a writer so yeah

you're familiar with this you're familiar with this with this um well i could say debate i think

really i mean you know for sure yeah actually there's been this sort of open pr it's just i

feel ashamed to even mention it there's been a pr open for forever and i have not merged it in

it's just been one of these things where it's like just big enough that it's just too much

work for me to look at it and i never have enough time to do it um yeah and it's this pr to do HttpOnly

cookies and to have that be where the JWT is stored and I mean I think I understand the

reasoning behind that because you know that's out of the reach of JavaScript so if you if your site

has been XSSed somehow then you're okay as far as that goes or at least that's my understanding

of it so i i mean i guess that's the trade-off for me like local storage you know javascript

has access to it and if somehow somehow if someone somehow manages to inject javascript

into your site then you're hosed um but if you if you're using HttpOnly cookies then you're not

so that's kind of the main difference as far as i can see it uh i should probably read this piece

that you're talking about i maybe i remember seeing it let me i'm just trying to i was just

looking at it yesterday um give me i'll i'll add it um because there's and there's literally like

i mean i almost feel like we should there's there's this whole reddit thread um just on this

very issue and lots of um misconception um what is his name oh randall randall degges yeah

actually i think i should comment that um the reason that was not included by default in my

package is because, as I mentioned, we were using this framework with mobile clients.

And so there's really no concept of a distinction between different kinds of storage.

on a mobile client you know you just it's just on the client right and so you just send it along

with the request in a header and it's up to the client to figure out where they want to store it

and so that was kind of the attitude that I think led to that not being available but that's that's

the main thing that needs to that I just need to like sit down and grow up and just merge that PR

and get it taken care of.
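
[Editor's sketch: the two delivery options discussed above, side by side, using only the Python standard library. The cookie name "jwt", the five-minute lifetime and the "Bearer" scheme are assumptions chosen for illustration, not settings taken from the package or from the open PR.]

```python
from http.cookies import SimpleCookie


def httponly_cookie_header(token: str, max_age: int = 300) -> str:
    """Build a Set-Cookie value that page JavaScript cannot read."""
    cookie = SimpleCookie()
    cookie["jwt"] = token                 # hypothetical cookie name
    cookie["jwt"]["httponly"] = True      # hidden from document.cookie, so XSS cannot read it
    cookie["jwt"]["secure"] = True        # only sent over HTTPS
    cookie["jwt"]["samesite"] = "Lax"     # limits cross-site sending
    cookie["jwt"]["max-age"] = max_age
    cookie["jwt"]["path"] = "/"
    return cookie["jwt"].OutputString()


def bearer_header(token: str) -> dict:
    """What a mobile client would attach to each request instead."""
    return {"Authorization": f"Bearer {token}"}
```

[With the header approach the client decides where the token lives, which is the attitude described above for mobile apps. With the cookie approach the browser stores and sends it automatically, so cross-site request forgery protection matters again.]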

Yeah, but it's not, well, there's two things to say.

One is, from the maintainer's point of view,

it's not as simple as just, oh, I need to merge.

It becomes this rock that you have to carry around with you

all the time.

Exactly, and that's why, yeah, right.

And it's not that you don't care

or that you aren't committed or that you,

it's that, ah, I just literally haven't got the mental space

to deal with this quite difficult topic.

yeah properly and it just you know the maintenance of an open source package becomes

really difficult over time for that exact reason and it's such it's so funny like my perspective

on that whole thing obviously has changed considerably since this package became popular

because i used to be i think like a lot of typical open source users where they just kind of show up

and you know if the package doesn't do what you want it's frustrating you've got work you have to

get done and you just feel like you know why why does the package do this it should do this you

know yeah you feel entitled long essays long essays on the issue tracker about why your package should

be better but no pr to make it better right yeah the time that could have been spent to fix it is

spent complaining yeah for sure yeah i was that person and i i think back on it i feel sad and

now i obviously the tables have turned and i i'm like okay now i get it it's hard like and like

you're saying you just don't have the mental space and you might have a certain standard that you're

trying to maintain yeah exactly quality and you feel obligated to sort of essentially rewrite

people's prs in a lot of ways and um yeah but you you also you you've got x number of users you

can't merge something that you're not confident of and put a new package out you have a responsibility

to the people using it the damage to a broken fix is more than the the benefit of getting the

so yeah it's kind of a conservative as your package gets more popular it becomes increasingly

conservative if necessarily yeah right and actually i've seen a lot of people who sort of

take this approach to um the life cycle of any project where you have like a decreasing version

number and when the version number gets to one or zero or some you know magic just stop merging

you just you can't make any more additions to it and so you sort of like you better be working

towards the definitive like watertight version of the package or you're going to be in for trouble

you know yeah i really like that towards django rest framework adopting something similar to that

i think he's um yeah yeah no it makes a lot of it makes a lot of sense i was gonna i think the

thing with rest framework is that there's um like it's on 3.12 to be soon you know 3.11 now 3.12

will be soon just soon as you know we get the you know the pandemic lets us get the prs finished

get it out the door but um you know that what's there is no dying need for django rest framework

version 4 so it's going to be three point you know so then at what point well okay can we can

we refresh the version numbering to say give a more time-based um idea or perhaps to say keeping

up with django and say okay we make version 4 match django version 4 and say you know if you're

to be on django version 4 use rest framework version 4 maybe these are some ideas but the

idea being that django rest framework is kind of the done it's kind of done now it just needs to

keep ticking over and i think it's it's weird because that's actually an extremely healthy

attitude to have towards a software project although it's sort of not fashionable because

you know you always want to give the impression that your project is like hot and there's like

lots of activity going on but i i think it's better to strive towards some core like useful

thing which is like coherent and useful enough that it can just sort of stand on its own

like for the ages you know um and yeah that's like i think that's a a better approach to take

but django is exactly like this right django is super stable 15 years old super stable gonna keep

just keep growing new features all the time but it's it's that reliable rock of the ages thing

and then rest frameworks around it and there's django filter and crispy forms and i know whatever

package that is packages that have been around for for forever and you know yeah they're not

going anywhere part of what allows them to do that is like i was saying there's some set of

core concepts that really upholds the the project or the package and they've just been dialed in

really precisely and they work really well and it's kind of a testament to something that was

really well thought out you know and and it's it's like just uh i mean django is a complex framework

but one of the things i try to go for when i write software is just to simplify as much as possible

like there's some there's just a few lines of code essentially that's doing what what is important

and you just really need to zero in on that i think that that holds true django as well like

you've got the core http handlers they're you know certainly 160 lines in the file in the file

it's not complicated yeah and like the core like form classes and things like there's only a couple

of classes yeah that's right yep i think you have to be at an expert level to see that but it is

it is the case i mean that's like i always think if you can't if something seems incredibly

complicated to you you don't really i mean that's what einstein's thing right like you don't fully

understand because if you can just go well there's this and this and then blah blah blah i mean you

can boil most things down to that but it's not always easy to do that like when i like to total

beginner with django i often try to say like well you need you know a model url view and a template

and once you kind of internalize that for a single page like you're good um yeah i can say that

until i turn blue in the face but um but then when you get the other side you're like yeah or then

you can say like like carlton's talk he gave last year where like what is django it's basically

middleware and it's just like an onion in a way a little bit or um you know but that's easy to say

that doesn't mean anything though anyways what i think is interesting too

yeah i like that um authentication in general is uh is challenging and changes because the web

changes i mean this is uh you know for django itself if we had i saw so i'm on the board and

trying to raise money and if we had a whole slug of money the auth package is something that we

would put it towards if we're going to do updates like authentication in particular is something

that needs work and i think maybe even more than other things changes on the web as as best

practices change would you agree with that carlton yeah i think that's on the list i mean you know

they're discussing jwt on the mailing list just this week and you know um yeah there's comments

about it being an overly complicated protocol and so we you know security issues have come up

because of that yeah so well okay well if we're not going to have there's a talk about bringing

jwt or at least some of the foundations of it into um django so that packages could build on it

and it was like well no not jwt because it's overly complex and then there are better algorithms but

we haven't all better um protocol better uh specs better specs to follow that are simpler and do

the same thing give you all the benefits but without the the downsides of what would they be

what might we do you know this is an ongoing discussion yeah it seems like you could just

take a subset of the jwt spec and just sort of enshrine that as the approved like configuration

of jwt or something yeah i mean somebody mentioned um PASETO or Platform-Agnostic Security Tokens on

the list as you know it's supposed to be everything you love about you know jwt or whatever but

without the the design baggage i don't know coming a little bit up on time but i did want to ask you

about what you've been doing these days where i think you've you're currently at a company called

unsupervised and you're more dipping into the ai world is that accurate yes yeah that's right and

of course i just came from i've kind of been through oh ethereum yeah yeah i've been through

an interesting tour of different areas of the technology world in the past couple of years

um because yeah i feel like i can't i can't just skip over the ethereum foundation stuff

um but yeah i had a i'll just touch on that briefly if that's okay before

um yeah because i had a friend who uh and and i probably shouldn't name names too much because

people in the crypto world are kind of sensitive about um being having really public personas

because they tend to be very big targets security wise so um but yeah I had a friend who

uh who I worked with earlier in my career who was kind of involved with that organization from

pretty early on and so that was my connection there and I hadn't previously really been into

crypto stuff like there's a certain sort of category of technology people who are super

passionate about that and super into it and um i wasn't really one of those people but i i sort of

came away and in a way i'm glad because i i came in very sort of innocent like with no preconceived

like ideas about what i was getting into and and even actually a fair bit of skepticism

uh because you know i recognize there's this just sort of inherent hype factor around cryptocurrency

stuff that just to me is just sort of a critical thinker just kind of turned me off frankly of it

um me too and but i came away just being with a different kind of attitude like oh well there's

kind of a very interesting trend in distributed systems happening that's that's that was my

takeaway you know like some people figured out kind of this clever way to have a a shared uh

verified you know cryptographically verified database and other people are also working on

kind of uh next generation like peer-to-peer like file sharing networks and there's just a lot of

interesting kinds of projects in that space and so that was definitely eye-opening and a very

useful experience for me so yeah without spending too much time on that um i just wanted to touch

on it so yeah so but but the thing i should say is that even before ethereum the area of tech i

was starting to just personally be very interested in was the machine learning deep learning scene

and um and i had taken Andrew Ng's uh i think that's how you pronounce his name

yeah i'm not sure either different things yeah uh Andrew Ng's uh machine learning course on

coursera um i think i took so you completed it yeah you actually completed it yeah because i

know i think it's like a hundred thousand people signed up and like maybe a couple hundred completed

that or something i definitely and it i think it connected with sort of uh how could i say it like

like one of the things i think that inspired me to get really into programming originally

or one of the things i found myself pondering like really early on when i was learning to

write code is just I think any person who learns to write software has this sort of initial learning

curve of how do I translate my thoughts into discrete steps that a computer could understand

and that's just like a naturally hard thing for humans to do like any person I mean it seems from

my point of view any person who's not accustomed to writing software has this initial phase where

they just have to learn how to think clearly and discretely like because you know when you ask

yourself how do i like divide a number like well you just do it like it's just kind of a single

motion of like handwriting on the paper and um but when you actually think about how you have to do

it with some like primitive set of operations um it's difficult and so you know there's and and

also there are certain classes of things which are like processes which are very natural to

represent mechanically and there are other things that which are not like you know mathematics

number crunching calculation or i should say calculation not mathematics calculation is very

natural to represent with a computer um but you know recognizing an object in the environment

like humans do naturally in a funny sort of ironic way some of the things we do most naturally are

the hardest to represent like mechanistically right like how do we recognize objects in the

environment how do we even like um like recognize textures colors like these things um it's kind of

interesting when you start you look at the spectrum of phenomena and like you there's like

this gray area where things start to be very difficult for a computer to deal with right and

so i think really early on that was just an interesting thing to me like just observing

how i learned how to write software and how like say if i wanted to write a video game character

and i wanted it to behave naturally and not like sort of awkwardly and robotically like how would

i even do that like i i was asking those questions really early on and so um i also had a good

fortune of sharing an office um at one of my earlier jobs with a guy who had done a proper

phd in ai um way back in the 90s and 80s this is like a totally different era of ai you know

and yeah um and so i think my relationship with him also helped me really get curious about that

topic um and i remember he showed me how to implement a backprop neural network because

this is a concept which has been around since like the 80s or probably even earlier like the

whole basic idea of a perceptron and um like there's some of this really early work has been

in the literature since decades ago and just didn't really come into popularity until recently

for all kinds of different reasons and so so i got some early exposure to that and so

you know fast forward so by this person i shared an office with was in the late 90s

around the turn of the millennium and fast forward uh 10 15 years suddenly you know we have

a few sort of algorithmic advancements um hardware advancements like people start using gpus to do

compute and so all these critical elements come together and there's this kind of renaissance

revolution in deep learning and you know it's just one of the things that I noticed happening

in the technology world I noticed this this starting to happen and I was thinking oh yeah

this is I used to be really interested in this and all these learning resources start popping

up like Andrew Ng's course and things like that and so I just personally got back into it I just

found it naturally fascinating like from a technology point of view from a philosophical

point of view like there's just so much about it which is uh there for me so yeah and so what is

on so you're working on this product or this startup unsupervised yeah so i think

there's like there's only a certain amount i can really say but um the well the tagline is

autonomous analysis exactly that's that's it so i guess that's the headline on our page it's on

the website so i assume you can say that i wasn't even aware of that so that's it uh there's like a

certain sort of collection of things that data scientists tend to do to help companies understand

data and we're basically trying to automate that and we're trying to use ai in order to explore

the space of possible ways of looking at data automatically and also intelligently like not

just like doing a grid search through, that's like a kind of technical term, like grid search

through the parameters of how you might configure a data processing pipeline, but trying to

sort of leverage some actual developments in AI algorithms to effectively search through

that space of possible ways of looking at data.
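
[Editor's sketch: for listeners unfamiliar with the term, a plain grid search just tries every combination of settings and keeps the best one. The parameter names (n_clusters, window_days) and the toy scoring function below are hypothetical and have nothing to do with Unsupervised's actual system; they only show the brute-force baseline that is being contrasted with a smarter search.]

```python
from itertools import product


def grid_search(param_grid: dict, score):
    """Score every combination in param_grid and return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


def toy_score(params: dict) -> float:
    # Hypothetical stand-in for "how interesting is this way of looking at the data";
    # it peaks at n_clusters=4 and window_days=30.
    return -((params["n_clusters"] - 4) ** 2) - abs(params["window_days"] - 30) / 10


GRID = {"n_clusters": [2, 4, 8], "window_days": [7, 30, 90]}
best_params, best_score = grid_search(GRID, toy_score)
```

[The cost grows multiplicatively with every parameter added to the grid, which is why an exhaustive sweep stops being practical for a realistic data processing pipeline.]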

And so the idea is that we should be able to give a company access to our data analysis system and they can just sort of send their data in.

And the system comes back and says, hey, you should look at this group of customers.

There's a lot of interesting activity happening in this area.

Or you should like, what do you think about this, this way of looking at things?

Is this interesting? Is this not interesting?

and you can sort of like give some feedback and have that modify the behavior of the system and

so yeah does that interact with with django or i mean there's a web component that's a like as a

broad naive question i'm always curious how web interacts with machine learning because you need

to store it and interact with people in some way even though it's not the main focus yeah um and i

I mean, we do have web apps, which are kind of like the interface to the system, obviously.

It's not necessarily Django.

Like, it's just kind of, you know, because I came into the company after a lot of those choices had already been made about what technologies to use.

I mean, it is Python, but not exactly Django.

Is it not exactly Flask either?

Oh, it is Flask.

I can say it's Flask.

I mean, there's two options, basically.

And when you say Flask, that just encompasses everything else.

You know, it's...

Yeah, so we use Flask a fair bit.

Yeah. I was going to... There's this book, AI Superpowers, that came out in 2018 that I read,

which for a layman like myself was a really good introduction to machine learning and the

capabilities. And the author, who I think got a PhD at Carnegie Mellon back in the 80s, and then

he ran Google in China and Apple and Microsoft. And he was making the point, kind of what you said,

that basically we finally have these deep learning algorithms that we were waiting on.

He said there's three things you need.

You need computing power, you need the algorithms, and you need the data set.

And so his argument is we have the computing power, we have the algorithms,

they're all academic, they're out there, and so it's a question of data.

And I think he's from China, he grew up in the U.S., but he's back.

I mean, he has a Chinese point of view.

So he's very much like China is just going to crush the world.

And he gives the example of when with the Go, was it DeepMind?

The Go game in 2013 beat AlphaGo.

Yeah, he says that was a Sputnik moment in China.

Yeah, right, I've heard that analogy used.

Yeah, and he said they kind of realized, oh, wow, this is a big thing.

And he gives examples of how China is sort of throwing cash in sort of a government kind of way, setting up these innovation centers and this and that.

And, you know, that combined with they kind of skipped the landline Internet and went to phones.

He's very bullish on the Chinese approach and kind of glosses over the surveillance stuff.

Yeah. But anyway, that was.

Sorry to interrupt. Go on.

Yeah, no, that's all I was going to say.

Yeah, I just wanted to comment because you said, you know, you need the algorithms, you need the compute, you need the data.

That's what he says, yes.

That's interesting to say because that touches on one of the things that I think we're trying to do at Unsupervised, which is, you know, indicated in the name of the company.

So we're really focusing on this class of algorithms called unsupervised learning algorithms.

And the reason we're doing that is because, in a way, we don't have the data.

people who people who have the data are google and amazon but google and amazon are two you know

very specific organizations which have the data most people don't have the data so to speak because

they don't have the the massive data sets necessary to really train these uh algorithms

correctly or supervised learning algorithms and so there's this whole other class of algorithms

where there's not a need to have a very large very well curated data set or there's less of a need

and so we're sort of targeting that area because we think there's a lot of promise and i think that

would um the world of unsupervised learning algorithms is the the area which has to be

developed in order for ai to really become like the sort of future world technology that we think

of it you know like um ubiquitous like used for everything uh there needs to be a way that we can

find to leverage all the data whether or not it's labeled that's kind of the the key takeaway like

well because i think the again just going off of the book he was mentioning that many of the

the deep learning algorithms you need narrow discrete kind of data like you know is it a car

is it not a car so the quality of the data set is paramount as opposed to what you know yeah maybe

what you're you're saying with you know most of the world isn't organized like that the data so

yeah and actually that's probably how i should have said it like you know obviously we all have

access to a lot of data these days like we can do a google search we can we can just go crawl the web

for just random information but only a few organizations have access to really

well curated data and that's that's people like when i don't know if that's if that's the same

type of um machine learning where there's the idea that it would like with with go you kind of know

where you want to you you feed data to the to the algorithm that says like this is what a car is and

then the computer figures out its own metrics for determining what a car is so it works backwards

in a way from the solution, right?

I think, I believe that's kind of the central idea

of why you need that structured data

because the computer will find its own way,

you know, or a deterministic game like chess or Go,

it will figure out how it wants to get there,

but it needs to kind of know where it needs to go.

That's why it needs to be structured.

Does that make any, is that accurate?

Yeah.

That was my, that's how I've been thinking about it, but.

Yeah, well, so I guess I can say it's,

i'm wondering how how intelligently i can comment on this um more than more than i can i'm sure so

so when you're when you're training a supervised algorithm you're essentially training a statistical

model yeah you know and so so if the if the statistical variations in the data set don't

include uh information about the correct way to look at the data set then in a way there's nothing

to train on that's kind of one way to say it like like if you don't have this sort of like if you

have a bunch of uh samples you need a notion of fit yeah right exactly you're trying to fit

something you're trying to fit a uh you know a curve or you're trying to fit a certain style

of like categorizing things and if you don't have that information then you can't you can't cue off

that but there are different ways to kind of look at it like you can i think well so i'm not really

a super expert in this area but this is like stuff i'd like to learn more about i think one of the

approaches is to look for the sort of latent statistical information in a data set and then

try and find trends in that sort of latent statistics in those latent it's kind of what

a human analyst does right you they get some data they try they they put it into different

graphing structures and see which ones oh yeah look at clusters like this yeah right that and

the human eye is very good at picking that out and that's sort of one of the early um descriptive

statistics that you do early on before you before you get down to the proper number crunching it's

just looking at it it's just all right look here's the shape of the distribution here's and we can

already learn some things but humans are good at that but i don't know that machines yet are well

so so here's the one sort of uh here's a particular sort of topic which i found interesting

and like i was saying when you train a deep learning model you you have to have this uh

this target that you're trying to get to like you have to have this set of correct classifications

of things you're trying to classify like if you have a bunch of images you need to know is this

a dog? Is this a cat? So on and so forth. A different approach is to train a deep learning

model to simply output whatever it's given. But what you're doing in the process is you're forcing

this data to flow through a representation that's less, what's the word, has less fidelity than the

original input. And so what the model is then forced to do is find the essential patterns in

the data that you're giving to it because it has to it has to find a way so when you're training

this model it goes through this sort of narrow sort of low fidelity representation but then the

goal is to reconstruct the original input and so what the model is then forced to do is to find the

most important sort of latent patterns in the data set and that's like sort of a shortcut in a way

like you're finding this sort of patterns which are just inherently there in that kind of data

and then you're saying okay well well those patterns are are clearly what's important so

we don't have to waste a lot of time sort of training those up from just raw classifications

like that that stuff is going to be there or not whether or not we have classifications and so

that's one sort of i think direction people are going in to try and find a way to leverage unlabeled

data and that's that's kind of tied in with this area of research uh with things called auto

encoders variational auto encoders and um so that's that's a really interesting particular

topic in that area to me which i want to learn more about um but that's just an example of like

unsupervised versus supervised models you know as we kind of wrap up how would you recommend

you know say someone who knows a little bit about python and django but is curious about

ml what sort what would be uh where should they start i mean because andrew

Ng's course is sort of famously fine fine fine and then like oh my god math yeah um is there a

way that you would that was my experience with it um what would you suggest to like a listener

of the podcast or a you know co-host who's interested in this area because i think this

stuff is like the coolest stuff right i mean web stuff is a little bit you know it is what it is

sort of a solved problem but yeah right this is all uncharted frontier um i mean i would say

the course is definitely a good place to start i mean you definitely get to that math point

but i think you can just tell yourself like well as much as i can get away with ignoring the math

for now i should try to do that and just just try and go work through the exercises and see how far

i can get um and you probably have to struggle through things a little bit in that sense up front

uh but i think what's cool about those courses is that if you can get things working it's just

really satisfying and it's really interesting to see it work um like i took actually there was

another course i took which was also by Andrew Ng there's a series of courses all about deep

learning and there was a little bit later on in the series they get into this neural style transfer

algorithm where you you have two input images uh to the algorithm one is an image which contains

some artistic style essentially and the other is an image which contains some content that you want

the style applied to and so of course the output is the content image in the style of the style

image so you can take like a picture of yourself and make it look like van gogh painted it or

something like that and uh and the algorithm works remarkably well and to see it work is just amazing

in my opinion like to to imagine that we're at a point now where a computer algorithm could so

effectively sort of deal with artistic styles in this way that it's doing is just mind-boggling to

me and so i guess my point is just if you can somehow get hands-on with the algorithms and

see them work it's just really interesting and inspiring and i would hope that that would kind

of eventually lead a person to try and learn more about the mathematics because i i personally

believe there's a lot there to that you could get as well but um well you i'm sure you've heard like

these algorithms will you know write a piece like bach or beethoven or haydn and it's crazy how

you know they how kind of like they it sounds like you know they pick up the underlying

algorithms of those composers and it's a new piece and you know maybe it's not quite as good

but it's like it does sound like sound like well i mean so again it's like uh i would i just want

to urge people not to give up on the math because especially in the case of the neural style transfer

algorithm. If you understand a thing or two about the math, in my opinion, it can tell you a little

bit about the artistic process in humans and why people do art at all. I mean, I know people

who are in the AI space are always kind of like a little bit cautious about making pronouncements

like that, like, you know, drawing too many direct comparisons between the ways that like

deep learning algorithms work and the way that biological brains work because obviously there's

a there's well not obviously but it it is supposedly the case that there's a fair bit

of complexity in a biological brain which is just not represented in an algorithm algorithmic brain

i guess you would say um but when i was learning about that algorithm it in my it taught me

something i believe that i just never realized about art and like why do humans at all care

about abstract forms shapes like why why do we care about um representing things from the real

world in this abstract form that we call art like why is that important to us and it seems like it

could actually serve an important function in just the way that the brain works like when you look at

deep learning algorithms there's all these neurons kind of packed in the middle of the algorithm

which are doing things that we kind of don't really know what they're doing we don't we don't

have much insight into it and it seems to me like it's possible that artwork could kind of serve to

to give access to those kind of hidden layers somehow so yeah and again that's like i feel

like that came to me just from understanding a bit about the math of how those things work so

i encourage people to to look into that fair enough that's cool i can't think of a better way

to end. You are looking for help with the Simple JWT package? Yeah. Is that correct? I've

recruited a couple of people recently who are trying to do some triage

on issues, just to hopefully help people out with some of the simpler things that come

into the issues section. I've been extremely busy with work lately, so I haven't had a lot

of time to actually look at PRs and merge them, but I'm aware that that's an issue. So yeah,

potentially looking for more people to help with that, especially people that have lots of

real experience and association with other prominent projects in the space.

And I also was looking into maybe getting it added to this organization called Jazzband for a while.

Well, yeah, because Django Rest Auth was just merged over, which is a very popular way to do it.

It kind of gives you pre-built auth stuff.

That seems to be successful, that they merged that over to Jazzband in the last year or two.

Yeah, I still am strongly considering that.

I was kind of back and forth on it because I thought maybe it'd still be nice to sort of have a sense of ownership over the project, but it may not be realistic for me to try to do that.

But I was actually having a little bit of trouble contacting them, so I should try again and see if I can get through.

Well, it's a foundational piece of the modern Django web stack. So hopefully someone listening can help us make those connections because we need, you know, we need that package to work.

Well, and so the good news, I'll just remind people, is that it's a pretty simple package.

There's just not many lines of code there, so if it has fallen a little bit into, sort of,

you know, if I haven't had a lot of time to look at it recently, I don't think it would be very hard

to get it back into shape. So yeah. You know, I'll say my bit here, which I

always say, which is that contributing to an open source project is a great learning opportunity, and it's

a really good way of boosting your profile, and it's a really good way of giving back to the

communities. And if you've been thinking about it, and you use Simple JWT,

then maybe the planets align. Yeah, yeah, please come and make your

contribution, and please be patient as people find time to look at it and whatnot. So yeah, we should

say that to all the people with the issues: be patient. You've got to realize that it's

not some organization with lots of people and lots of hands. It's one person.

"I'll just look at one more ticket," you know. Yeah. Well, I guess there's one last point I wanted

to make. You were saying earlier about being one of those people who

get worked up about something that isn't solved versus solving it themselves. I mean, I'm that way

too, but a lot of times that's the type of person, if someone's passionate enough about something,

even if it's in a negative sense, that can be turned into something good. In the same way that,

for companies I've worked at, if someone takes the time to write in, or someone takes the

time to tell me my books are terrible, I can usually engage and flip that person. It's people who aren't

passionate, who don't care, that you have no engagement with. Yeah, yeah, that's true. For the community,

I think it's good to harness that passion. I always look at it as, whether it's an open source

thing or one of my books, if someone takes the time to do that and they have that passion,

they're basically saying, "Convince me." And they can be convinced. It's the

person who doesn't even bother writing or complaining that isn't gonna. Yeah, that's true.

It's an opportunity to communicate. Well, we have links to everything in the show notes. Thank you

so much for taking the time to come on and (yeah, thanks a lot, guys) share your journey. It's been

really interesting. All right, well, everyone, we're @ChatDjango on Twitter, and we'll see you all

next week. Bye.