Transcript: Production APIs - Calvin Hendryx-Parker

Hi, welcome to Django Chat, a podcast on the Django web framework. I'm Will Vincent, joined

by Carlton Gibson. Hello, Carlton. Hello, Will. This episode, we're very pleased to

welcome Calvin Hendryx-Parker from Six Feet Up. Hi, Calvin. Hey, Will. Hey, Carlton. So

it's been over a year since we last spoke. Last time we talked about conferences and the LoudSwarm

project you're working on. The new version of Python Web Conference is coming

up very soon. So I thought I'd ask you to say a little bit about that virtual conference and

what people can expect. Okay. This is the fourth year. Yeah. Fourth year for the Python Web

Conference. We started in 2019, uh, pre-pandemic, but fully virtual. Uh, it's always been a virtual

conference since the very beginning. Uh, we felt it filled a gap, uh, where there weren't a lot of

web talks. It seemed like going on at various other conferences and gave an opportunity for

those who couldn't normally make it to in-person conferences, a great place to kind of gather and

see great talks from the people in the Python community who rock doing all kinds of work. Um, so

this year is going to be March 21st to the 25th, so if you're hearing this now, go grab a ticket

really quick there's probably plenty of space uh left since it's a virtual conference we don't have

constraints like catering and things like that which is nice but we will have um definitely

more attendees this year than last. We've already had probably about 3x the signups that we had

last year, which last year was double, uh, so I was super excited about the growth of the conference

itself, the speakers this year are fantastic. If you go to look at pythonwebconf.com,

you can see on the speakers page, it's my favorite page to go to, because it is a beautiful page of

just amazing folks who've, you know, given their time to come and speak at the conference. And I

think you'll find it's just not your same boring old panel of people. There are really diverse,

interesting folks from all backgrounds telling all kinds of stories. We're going to have, you know,

two tracks of app dev, we've got a cloud track, a PyData track, and a culture track, and then we

added in a tutorial track two years back and so that's actually running all five days so we've got

six tracks almost all day. It kind of starts in the morning Eastern time and goes to about

one or two in the afternoon US Eastern, trying to make it friendly for most folks who could join,

but we're expecting people from all over the world we've had you know 30 plus countries last year

you know, almost 20 time zones, uh, covered, uh, for folks who are joining. One cool thing with

the platform we're on: it's built in Django. Uh, the LoudSwarm platform actually allows for the

recordings to be posted right away, so if you miss a talk or there are

conflicting talks in about 10 minutes after that talk those talks are online and actually people

are time shifting and watching you know 24 hours a day and they're all hanging out in the in the

chat. We just use Slack for chat, and so you'll wake up in the morning and find people

from all over the globe chatting about the talks they were watching, or Python, or open source, or

who knows what we've got some really cool socials planned as well so each day there'll be some kind

of a social activity one of our speakers is giving a talk on burnout and she's going to do a breathing

exercise and a mindfulness exercise that i'm really looking forward to i think that's really

important for folks during you know these obviously troubling trying times uh for us to focus on if

you can't focus on yourself and help yourself first, you can't help anybody else. So you need

to make sure that you're taking care of yourself. So we've got a whole session on that plus a social

dedicated to some mental health. And I'm looking forward to that. The keynote speakers, again,

amazing group of people. The last one on the last day is a gentleman I met out in San Diego in the

last year. And he's going to talk about a project he's been working on, a real passion project for

him and i think it it should hopefully i think it'll bring the room to tears uh the virtual room

to tears but yeah it should be tons of fun i'm super excited the community really shows up for

this one. We've got over 90 speakers, which is, I mean, I think last year we had 60, so we're looking

at, you know, 50% more. More everything. It's all more. Nice. Fantastic. Yeah, that's an amazing conference.

it's it's been interesting too because everything everything's been virtual the last couple years

The DjangoCons have been, I think both US and Europe are planning to be in person this fall.

But it's, you know, even hopefully once we're back to normal, there's something to be said

for a virtual conference because you can have more speakers. It doesn't limit people. I mean,

it'd be interesting to see if there's some hybrid approach or how that works going forward,

because a lot of the, all the DjangoCons, they post the videos pretty quickly thereafter to

YouTube, but there's something about the immediacy of almost real time.

person, uh, in two months, like maybe into April. Yep, so I'll be there, yes. Um, but Python Web

Conference will always be virtual. That is the format that we want it to be in. We really

want to reach folks who couldn't normally come and if you have a need or some means by which you

can't make it to the conference because of you know financial constraints let us know uh reach

out the group is definitely willing to make sure that all the people who want to be in the room

can be in the room. And I'll be at DjangoCon as well, so I'm looking forward to hopefully seeing

you all at DjangoCon US. In Portugal, or are you just San Diego? Oh my gosh. That's, that's

definitely on my bucket list. Um, when they announced it was going to be in Porto two

years ago, I got so excited, um, to go planning my train journey since the day it was first

announced. Yeah. You, you were going to do a whole caravan Carlton, right? Take all your

kids and, um, yeah. Well, whether that happens cause they're back at school. So I might have

to go by myself what a what a bummer yeah dang it oh god i wanted to ask you calvin about um

the LoudSwarm platform. We did talk about it a little bit, um, the last time you were

on the show but yeah i'm sure it's come on in the last year and you've you've made enhancements so

perhaps you can talk technically there but um perhaps as a kickoff i wanted to ask about what

was the motivation for LoudSwarm. Creating Python Web Conference, was that it? Oh, it was 100% the need for

a better way to engage during virtual conferences the bulk of the platforms are pretty rigid and

don't allow for a lot of that kind of tightly coupled engagement between chat the speaker

q a the the podium experience afterwards of talking to people you know the the whole social

aspect of it, and we just thought there definitely had to be a better way to do that,

so we built one. Um, it's been really an interesting journey. The technology we used underneath the

covers for LoudSwarm actually has helped us build other projects for other clients, which I'll talk

about today, because I think we're going to talk about one of our other projects that we worked on,

and a lot of the ideas that got spawned during the LoudSwarm project have now shown up in a lot

of our other projects, especially when it comes to things like serverless and scaling Django and

doing things like WebSockets and, you know, real time. Those are all really interesting topics

that came out of that. But yeah, the LoudSwarm experiment has been totally fun. Uh, it gives our

developers a chance to kind of you know breathe a little bit stretch and uh and work on technology

that's that's a little more interesting than some of the other client projects not that any of our

client projects aren't interesting but there are definitely aspects where they would love to try

things and not have the risk of it being on a production project for our client but okay

something that's really owned by us and built by us i think that's a really interesting thing

just from a developer point of view, because, you know, Django excels at solving the

web problem quickly. Like, you've got a TemplateView, you've got a template, you get your HTML out

there, you tie in some JavaScript, static files does its thing, problem solved. Oh wow, what

did I need? I need a TemplateView and a URLconf. Well, that wasn't that exciting. But then I get, you

know there's all this other stuff you can do which you hit basically a whole bunch of buzzwords there

I don't mind which of those you kick off on. Well, Carlton, you're maintaining Channels currently, I

believe, right? Or helping to? Well, I'm sort of not maintaining Channels at the moment. Yeah, to the

extent anyone's keeping Channels, yeah, I'm keeping Channels alive, basically. It's very much like,

okay this is where i want to spend my time this is where i want to you know my my volunteer open

source but there hasn't been any of that time for probably six months now um and okay that will come

back, and I'll, you know, I take it over. But yeah, Channels is in a sort of maintenance mode at the

moment i close a few issues and make sure it's compatible with the latest version but it's still

waiting for it you know the next i know i feel i feel you on that one i i wish i had more time to

spend on it as well. I feel like Channels is almost a critical technology for Django due to

the real-time WebSockets and the fact that applications are much more like applications

than they are stateless web pages anymore.

Are you using Channels

or what are you using for the WebSockets on LoudSwarm?

Oh yeah, absolutely.

100% Channels.

We actually just,

so the latest rev of the stuff we did

was mostly to bring everything up to date.

So latest versions of Django,

latest versions of Channels,

latest versions of Flower,

which is a troublesome story

because those things,

the various, it's Carlton laughs

because he knows the version dependencies

between these things are sometimes tricky

to get upgraded right.

So most of the work has been done on just shoring up the platform,

making sure it's always on the latest stable updates of things.

Same thing for the front end.

So it's a React front end.

So we updated all the React bits here and there,

solved some little niggly type bug things.

But generally, the platform is how I want it.

It really solves the problem well.

It gives people a great place to come and view content together

and chat and socialize.

So can I ask a question as a consumer of Channels?

So LoudSwarm is consuming Channels, and I look at it and I think, you know, basically it's

kind of okay you know the core consumer layer the ideas they're they're nice and solid and there's

not much to it and it's kind of simple and what's missing is like the the maturity and the the extra

features around the top um would you think would you say that's fair or would you would you point

to no actually this is where our problems are and we need that fixed oh i think that's 100 fair

I think the simplicity of the framework allows it to be used in any way we want much more easily than if you had kind of been a little more opinionated with certain features.

So the fact that a consumer is just dead simple to write, it makes WebSockets much more approachable to folks.

So if you've ever tried just talking to a raw WebSocket with websocat or whatever on a command line, and you just get a stream of junk back at you, it makes all that a lot more approachable.

Some open source projects maybe are feature complete, and as they should be.

I don't know if we've got a bunch of wishes for it necessarily, other than to keep it easy, simple.

It's easier to understand.

It's probably then easier to maintain.

Easier to keep it secure.

I got one.

I got one for you.

It's probably, and I guess the issue really isn't,

it's not a Channels issue, it's a WebSocket issue,

is that there's no formalized authentication mechanism.

So a lot of times you're contriving your own means

of validating or authenticating or authorizing

who should be connecting to a specific channel

and what kinds of things they're allowed to do.

So you have to basically come up

with your own protocol for that.

Maybe some opinions are needed there

to just guide new folks in that realm,

because it's not straightforward.
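
One pattern sometimes used to fill the gap described here is a short-lived signed ticket: the regular HTTP side, where the user is already logged in, hands out the ticket, and the client sends it as its first WebSocket message. The sketch below uses only the Python standard library. `SECRET`, `issue_ticket`, and `verify_ticket` are hypothetical names made up for illustration; this is not a Channels API and not necessarily how LoudSwarm does it.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"change-me"  # hypothetical server-side secret, assumed for this sketch


def _sign(payload: str) -> str:
    """HMAC-SHA256 signature of the payload, hex encoded."""
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_ticket(user_id: str, ttl: int = 30, now: Optional[float] = None) -> str:
    """Return 'user_id:expiry:signature' for an already-authenticated user."""
    issued_at = time.time() if now is None else now
    payload = f"{user_id}:{int(issued_at + ttl)}"
    return f"{payload}:{_sign(payload)}"


def verify_ticket(ticket: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user id if the ticket is authentic and unexpired, else None."""
    try:
        user_id, expires, sig = ticket.rsplit(":", 2)
        expires_at = int(expires)
    except ValueError:
        return None  # malformed ticket
    expected = _sign(f"{user_id}:{expires}")
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    current = time.time() if now is None else now
    if current > expires_at:
        return None  # ticket expired
    return user_id
```

A consumer's connect or first-message handler could call `verify_ticket` and close the socket when it returns `None`. Keeping the lifetime short limits the damage if a ticket leaks.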

No, I mean, I don't have an instant story

as well. I mean, a lot of the issues that come up on the repo are things like, um,

getting it working with nginx, or a client keeps disconnecting, and, you know, what

JavaScript library should I use? And quite often I'm a bit like, I have no idea what JavaScript

library you should use. I'm sorry, I just don't know. Calvin, can I ask about, um, the deployment?

So you have the Django back end hosted on a server and then the React front end. What are you using

for the React front end for hosting? Because there's, you know, there's Netlify, there's all

sorts of options. I'm just curious about your setup. So actually, in the end, there are no servers involved

for the LoudSwarm project and for this other project I wanted to talk about, which is actually

quite beautiful. I love the fact that we don't have to maintain, you know, Ubuntu machines and go

in and patch them. We do have to rebuild images, so it's all done with Docker images that are

deployed onto Fargate in Amazon, which is basically an easier version of Kubernetes. Um, that's

probably doing it a disservice calling it that, but I want a place to run containers, I want

a place to run my Celery tasks, I want a place to run my Channels workers, and Fargate makes it very

easy to deploy the back-end pieces like Django and the workers and Celery and all those pieces just

as tasks, which is what they're called

inside of the Fargate ecosystem

And for the front end we're just using

React that's hosted in an S3 bucket.

There's nothing fancy

about the front-end pieces because it all

just runs solely in the browser.

It connects back with

WebSockets through CloudFront

and the load balancer, and we

actually split the load balancer in two so that WebSocket traffic goes

to a whole separate pool of workers for the Channels bits, and the API calls all go to a

dedicated set of workers just for the Django bits. They're using the same image, but they have

different characteristics. They need to scale differently based on memory usage, especially

in the case of WebSockets, because each socket connection takes up so much memory. You want to

scale that separately from the API servers, which actually are pretty low utilization. Most things

are cached when possible. We use a lot of Redis to cache complex queries so that, you know,

anything coming in for, like, the big schedule for an event comes most of the time right out of Redis.
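
The caching idea mentioned here is often called cache-aside: look in the cache first, and only run the expensive query on a miss or after the entry expires. Below is a minimal sketch, assuming an in-memory dictionary as a stand-in for Redis so it runs without a server. `TTLCache` and `get_or_compute` are hypothetical names, not part of any Redis client or of LoudSwarm.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-memory stand-in for Redis with per-key expiry (an assumption of this sketch)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)
        self._clock = clock

    def get_or_compute(self, key: str, ttl: float, compute: Callable[[], Any]) -> Any:
        entry = self._data.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]  # cache hit: skip the expensive query
        value = compute()  # cache miss or expired entry: run the query once
        self._data[key] = (self._clock() + ttl, value)
        return value
```

With something like `cache.get_or_compute("schedule:42", 60, load_schedule)`, where `load_schedule` is whatever function runs the complex query, most requests inside the 60-second window would never reach the database.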

And can I just say, you'd be using WSGI for your base API and then ASGI just for the, um,

async WebSocket connections? It's all ASGI, so it's all Daphne. Uh, yeah, so we're using

Daphne both for the WebSockets and for the Django front end, although we're not doing

anything fancy on the Django side, uh, as far as async, other than talking to, uh, Celery workers

and the Channels workers. So I think we talked last time, we were doing a Discord integration on

the back end for LoudSwarm. That's not a Celery worker, that's actually a

Channels worker. So we're using the background workers feature of Channels, which is pretty

awesome, to basically start things on run. So when Django starts up, there's no easy way to kick off

a request to something or maintain a stateful connection to anything other than, like, a database

connection. This allows us to actually say, on run, run through this set of asynchronous tasks, and

Some of them can be short-lived, and some of them can be long-running.

And that's how we start up and actually connect to Discord servers

to establish a WebSocket connection that is server-to-server

and not server-to-client.
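
The general shape of "on run, start some short-lived and some long-running async tasks" can be sketched with plain `asyncio`. This is an illustration only, under the assumption that an event stands in for the server-to-server WebSocket loop. It is not the Channels worker API, and the function names are hypothetical.

```python
import asyncio
from typing import List


async def warm_up(log: List[str]) -> None:
    """Short-lived startup task: runs once and finishes."""
    log.append("warm_up done")


async def hold_connection(log: List[str], stop: asyncio.Event) -> None:
    """Long-running task standing in for a server-to-server WebSocket loop."""
    log.append("connection opened")
    await stop.wait()  # a real worker would read and write messages here
    log.append("connection closed")


async def run_startup_tasks() -> List[str]:
    log: List[str] = []
    stop = asyncio.Event()
    # Schedule the long-running task first; it stays alive in the background.
    long_task = asyncio.create_task(hold_connection(log, stop))
    await warm_up(log)
    await asyncio.sleep(0)  # yield so the long-running task gets to start
    stop.set()  # in a real process this would fire at shutdown
    await long_task
    return log
```

Calling `asyncio.run(run_startup_tasks())` runs the short task to completion while the long-running one stays open until it is told to stop.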

Yeah, well, I'd really love it if you could write, you know,

a company blog or something, write something up about the architecture here

because I think within the ecosystem, there's a real lack of, like, patterns

and, you know, everybody's still breaking the path through the snow kind of thing.

That is true.

I guess that's definitely one of the things that being opinionated

versus being, you know, a little more open about how you do things. And it's unfortunate for new

people. It's hard to grasp that. It's hard to not know, like, what's the one way? There's not always

a way, but there's definitely some patterns. So I think that would be a good idea.

That rings true. I just did the update to my Django for APIs book. And, you know, the only

way I know is just by asking around to people, you know, it's much simpler, but still the new

update gets more complex. And yeah, it's like people have questions. I'm like, you know, if

there's if there is a consensus in the community i'll tell you but sometimes there isn't right so

maybe as a lead-in, and speaking of case studies, you do have a whole bunch of projects on your site

we'll link to, and one of the ones we want to talk about, I don't know if this is switching over too

early, but the Flash project, predicting lightning strikes. Yes, this is such a cool project. Um, so for folks

who aren't aware of Six Feet Up, you know, we're a Python and cloud

consulting company, but we recently, well, in the last couple years, have set, like, our 10-year

objective, like, what does Six Feet Up want to accomplish across a 10-year

time frame and really it was to get 10 impactful projects in 10 years so when we think about

impactful projects we were trying to define a do-gooder like what is it what is a do-gooder

these days like that it took us two years to try and define that and we still failed at defining

do-gooder but we did define what impactful meant to us and so the project i want to talk about if

you go to our site there's a there's a link for our impactful projects it's basically an acronym

for criteria we use to measure a specific project against to see if it's impactful or not and the

one that you're asking about is really cool because it's doing something that you know as a

kid you're obviously fascinated with the weather and or even growing up like you see lightning

you hear thunder, although maybe most people, I don't know if most people do or don't. I grew up

in the midwest and so there's yeah you get more of that yeah there's lots of awesome thunderstorms

and so you'd watch these things roll by as a kid and i just i'm obviously you know infatuated with

it a little bit. But we had a friend of ours from the Python community call us up and say, hey,

i've got a a friend who wants to basically take a python notebook where he's developed an algorithm

for predicting lightning and you may think well haven't we done this and we haven't uh all the

lightning watches and warnings that you get are based on knowing where lightning is striking now

not where lightning will be. And so this is actually a new thing that had not

existed until we helped him put it into, like, a production state. So he developed this algorithm

in just a Python notebook, a Jupyter notebook, and obviously you can't deploy a Jupyter notebook

into the cloud and say, start selling this as a service to folks. But he has come up with a

way to give basically 99.6% accuracy and a 15 to 25 minute lead time. So he can say, within, you

know, 99 plus percent accuracy, lightning will be here, and within, I can't remember what the radius is,

but just yeah it's pretty accurate it was kind of interesting you know as we're working through

the project with the client, at some point he was actually with a race team down in Daytona for

the Daytona 500. They had him basically in the pit helping to predict, because if you're

not familiar with Florida, Florida is one of the lightning capitals of the world. And so they run

car races in a lightning capital of the world. They want to know when this is coming so they know how

long they can go until they have to kind of call things and that's actually one of the big reasons

for this project is that there are lots of activities where, if they

could shorten the window for which they have to close down certain kinds of activities, like planes

taking off or landing or sporting events or construction equipment being energized or not

energized because there might be lightning in the area. I mean, the amount of savings globally is

kind of ridiculous. Like it's a very, very large number. I hesitate to even fathom a guess at what

that would be but the ability to not only just save lives like if you know little league games

are going on and you can accurately say we probably should you know get everybody take

some cover, you know, 20 minutes early, to the fact that you can save money. The delays in flights

cost millions, hundreds of millions, if not billions of dollars to the economy each year,

and a lot of times they can't they have to be conservative obviously there's these are people

in planes and it's it's a big deal but if you can narrow those windows the the amount of money and

time saved by people is tremendous. So that's why this project was important, because it has this

impact on the planet the globe people um there are now you know there's obviously more commercial

sides of this where if i'm protecting if i'm an insurance company protecting giant equipment on

construction sites if that equipment can be de-energized when lightning hits it then there's

no damage or less damage to a piece of equipment apparently and so they want to be able to use that

information in an automated fashion to you know automatically shut down or turn on these these

pieces of equipment or tell the airlines when they can and can't you know open up their windows for

people to take off. Fantastic. Also, that would be kind of handy in Back to the Future type

scenarios where you need to power your DeLorean. Yeah, I mean, it's kind of cool because when you're

working on the project you start using the api yourself to be like hey i wonder if you know you

may hear a little rumbling you know in the in the sky and so we will go and run it and see like

where the lightning's coming and when it's going to hit that's that's such a cool instance of using

you know, Django to make a web version of something else, right? Like, I'm in Boston

and there's all sorts of science and medical stuff and i meet people all the time who are

who want to put it up on the web and you know there's less of a need of real time or

these other things, but even just taking a Jupyter notebook and putting it up on the web,

you know is is challenging and there isn't a one-size-fits-all solution so that's for sure

but that's you know every every stem grad student or researcher wants you know a web version of what

they're working on and if someone could solve that drag and drop uh that would be worth a little bit

Don't forget, like you say, there's no one-size-fits-all. I think a lot of times Flask

is probably an easy answer for these kinds of things, but maybe you don't have authentication

needs, or maybe your needs aren't sophisticated, you just need to get an API on the web. Here,

choosing Django was an easy choice. You know, Django REST Framework is super easy to get,

you know, something off the ground with quickly. Uh, we're using the Knox, um, API tools for

doing security, so you can actually easily rotate API keys and, like, enforce, you know, single clients

at a time, authenticating and using the API at once

and also kick off people who shouldn't be on there.

So it's a nice tool for generating things like API keys.
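
The behaviours listed here (rotating keys, limiting a client to one active key, kicking people off) can be illustrated with a toy token store. This is a standard-library sketch of the general idea, not Knox's implementation or API. `TokenStore` and its methods are hypothetical names, and timestamps are passed in explicitly to keep the example deterministic.

```python
import hashlib
import secrets
from typing import Dict, Optional, Tuple


class TokenStore:
    """Toy API-key store: keeps only hashes, expires keys, caps keys per user."""

    def __init__(self, ttl: float = 3600.0, max_per_user: int = 1) -> None:
        self.ttl = ttl
        self.max_per_user = max_per_user
        self._tokens: Dict[str, Tuple[str, float]] = {}  # digest -> (user, expiry)

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode()).hexdigest()

    def issue(self, user: str, now: float) -> str:
        # Evict the user's oldest keys so at most max_per_user stay valid.
        mine = sorted((exp, d) for d, (u, exp) in self._tokens.items() if u == user)
        while mine and len(mine) >= self.max_per_user:
            self._tokens.pop(mine.pop(0)[1])
        token = secrets.token_urlsafe(32)
        self._tokens[self._digest(token)] = (user, now + self.ttl)
        return token  # the plaintext is returned once and never stored

    def authenticate(self, token: str, now: float) -> Optional[str]:
        entry = self._tokens.get(self._digest(token))
        if entry is None or entry[1] <= now:
            return None  # unknown, rotated out, or expired
        return entry[0]

    def revoke_all(self, user: str) -> None:
        """Kick a user off by dropping every key they hold."""
        self._tokens = {d: v for d, v in self._tokens.items() if v[0] != user}
```

Issuing a second key for the same user invalidates the first when `max_per_user` is 1, which is one simple way to get the rotation and single-client behaviour described above.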

Because API keys, when you think about doing APIs,

I don't know, for me, the first thing that doesn't come to my head

is how do I authenticate and authorize people?

I just want to build the cool tool that's going to return some cool data

to folks to do something interesting with.

Now I've got to figure out how do I track their usage,

maybe bill them, how to, like, protect certain usage, you know, levels, all kinds of, like, that

back office stuff that goes into making a thing real. So the only reason I know of Django REST

Knox is because I had a version of it in an earlier version of the APIs book. And I think

actually, maybe I referred to it when I did a DjangoCon talk on Auth. But that is the, yeah,

the challenge of, you know, with JWTs, that's one of the appeals is you can time limit them and

stuff. Though there's, I actually, I'm curious to ask you, it's almost like the Docker question.

And how do you feel about JWTs?

Because I feel like everyone is all on board with them.

And my sense is now people are like a little less

completely on the JWT bandwagon for authentication.

It feels like it's all over the board.

Cause obviously I consume APIs in various projects

and it seems like every org has varying levels

of sophistication with doing API tokens.

Some it's still like basic auth and like,

oh man, like I can't believe we're still here.

The bearer tokens or doing like the, you know,

the rotating bearer tokens where you,

that seems to be a good pattern

that is easy for folks to like comprehend and use.

Like it's easy to plug into Postman to try it out.

Like you're not, it's not a stretch.

So I don't, I don't have a huge opinion on it

other than make it easy.

I mean, there's some systems out there

where I've had to like sign the requests

going back with the cookies and the tokens and then you get back a signed response that you have

to then validate the signature and it was all for a like a video streaming service with nothing

sensitive like i was like this this is that extreme for that you know it's hard though i

think right it's just like it's just this total like oh my god just make it work um even even for

otherwise knowledgeable clients, I guess, right? I mean, for LoudSwarm we kept it simple, and same

thing we did for this Flash project, which was the Django REST, uh, Knox tokens. Let Knox manage the

tokens. So it has an API endpoint for authenticating, and then once you authenticate you get your token,

and at that point you can manage those tokens and the parameters of those tokens with Knox, which

I found to be really just a nice, elegant solution to that problem. Now on the front end

we also used, um, I think it's django-sesame for doing, like, one-time passwords. That's Aymeric's

project, I think, right? Yeah. And that's pretty cool. Uh, we had done a project in the

past where we tried to go passwordless you know where you basically put your email address get

the token over in your email come back over and like you're logged in uh email is still such a

mess it is just i wish that would work but it really was a headache now we still use it for

things like password resets or invitations. That's where we use Sesame. So it's really easy to send

out a couple thousand invites to people for an event. They click on it. It's already got them

logged in. They fill out their profile. They click accept and they're in. That user experience is so

smooth as opposed to a lot of the signup experiences you may get at different sites.

So using the combination of Sesame for getting the invites out and then Knox for the front end

having the token, and we don't use any cookies in the LoudSwarm application at all. It's all using

local storage with Knox tokens. So maybe we're a privacy-first, uh, you know, event platform, but

there's no no tracking no cookies i mean there's tracking obviously going on when you watch a video

so we can know how many people are watching you know what videos and how long they watched for

and things like that but that's all done mostly server side or with the player and it's sort of

anonymous, right? It's not like, oh yeah, it's, it's Julie that's watching this video. It's just

a number of users. Yeah, it actually is anonymous. I don't believe we tie a specific user to what

their viewing was. We just want to know how many people were viewing a specific track or a specific

video. So you mentioned Postman. Um, is that your preferred client for consuming APIs? Cause there's,

I just saw in the orange site, there's a new open source one. There's Paw for Macs, which are, um,

just curious, as a practitioner, what do you like? I do like Postman. Um, also, I guess I'm kind of split

50/50. I'll use Postman if it's real quick, I need to, like, launch something, but if I'm actually

actively developing on a project, I'll be using, I'm a PyCharm user, I like PyCharm Pro a lot, and

the HTTP client built into PyCharm is pretty awesome because you can actually have it run

through and do full, like, save session state. You can do some of this in Postman too. You can save off variables

in Postman and then, like, reuse them later in subsequent calls. You could do the same thing

in PyCharm, but just as plain text, and so instead of having to kind of drill through GUIs to set

everything up, you can actually set up variables and, you know, do the login, get the token, then reuse

the token to, like, now call these, you know, 10 calls afterwards and simulate a real set of transactions

back and forth between you and the server. And you can actually do assertions, so you can actually

make kind of unit-test-like bits in PyCharm to assert. And actually Postman has this too, they've

both got similar sets of features, but one's more kind of graphical. If you're

kind of intimidated by lots of text, maybe you're new to it, or you just want a quick and simple,

you know, way to do it, Postman's the way to go. If you're really diving in deep and really want to,

like, you know, check the stuff into, like, source control and have it right alongside your

project, uh, the PyCharm HTTP client is really nice. So it's called Hoppscotch, two p's, that's the new

one. Oh, fancy. Well, it has 38,000 stars on GitHub, so I don't know how new it is, but it's an

open, um, open source API development ecosystem. So I guess it's not that new. But I mean, a lot of times

I'm still using, I'm using HTTPie from the command line to, yeah, quickly, you know, ping against

something right right i mean it depends the tool for the job right yeah for sure what about you

Carlton? You're building Button, you know, you're doing APIs. I use, I've been using

Paw, which is the Mac-native one, for years and years, and it's the same kind of thing. It's

got a request builder and you can um you know customize absolutely every aspect of the request

and you can view the request the response in different ways and you can you know extract a bit

and and then it will one thing i really like is at the bottom it's just got you know a drop down for

um, expressing this in whatever, so curl or requests or httpx or Swift or Rust or, you know,

and I find that really handy because, okay, you're sort of prototyping in Paw

and then, um, you know, cut and paste it into your project as working code. Um, but yeah,

they're all of these all of these tools have very similar capacities but that kind of um

http laboratory environment you kind of really need when you're developing against api because

the docs are never quite as good as you want them to be you know even like take stripe they've got

the best api you know they're not they're kind of great but you still have to make the request see

what comes back work out exactly what path it is you can't just go from the the api documentation

um so yeah i use ball but i have been doing for a very long long time but um yeah they're all great

i'm gonna go try the PyCharm one next i didn't realize it was built in there's so many things

yeah i never you're always discovering well if you go like the it's just in the new menu like

new HTTP client and then i could get a blank editor and it gives you some samples or you can paste in

a curl and it'll convert it right into the code for you it's kind of cool okay i'm gonna give that

a try that sounds good so can i ask the flash you you said the flash um the the tech on flash was

built from what you've been working on and LoudSwarm so is that built around Channels and as

well or is it more the the serverless deployment and that it's more yeah more the serverless

deployment aspect uh that specific application i don't think it uses web sockets yet uh it's

mostly an api consumed by their customers so they sell it as an api and they give people access keys

and there's some minimal management there's no real fancy front-end application around the front

of it it's not a consumer app it's really a business-to-business type application the pieces

we leveraged from LoudSwarm though were definitely the Fargate deployment technologies and techniques

this project goes a little step further in that we're also using lambdas i don't think

LoudSwarm does use a couple lambdas here and there but it's not in the main flow of the

application where this one actually is in the main flow we have lambdas listening to sns topics which

are just basically messaging coming from amazon for weather data so the national weather service

NOAA has a bucket a public bucket full of weather data which is really again so cool when

you start exploring some of these areas that there's like this open data i mean just like

terabytes upon petabytes upon petabytes of weather data are available to you as a human being on this

planet it's all open and free and accessible via apis or just it's just sitting in an s3 bucket so

we we take those notifications there's a real-time feed and an archive feed and i say archive in air

quotes because it's like every 10 minutes is the archive feed and the real time yeah yeah it really

in the real-time feed is a few times a minute you get uh the radar sweeps that are that are coming

through so we actually currently go off the i believe the archive data because if you go out

the whole technology problems behind the real-time part of it but with that archive data we now have

a lambda that can go grab the data out of the bucket pre-process do some some level pre-processing

against it throws the results into a redis and then that's available immediately for the django

api now to consume and reuse and republish back out so a lot of these things is about how do how

could we speed up that prediction process when we were first handed that notebook to take a archive

weather file and process it into the model needed to then do predictions against uh we were talking

in the order of like minutes i can't remember maybe in the case study we said how many minutes

but it was it was not a fast process for them to process it we got that down to uh in seconds now

you know using lambdas be able to scale that out changing up some of the you know algorithm to be

a little more efficient and it's a great combination having like scientists working

with real software developers because scientists have amazing ideas and when they can explain it

to a developer who can truly take that idea and accelerate it on the web their eyes just light up

like crazy like i can't believe this can actually happen so fast or we can deploy it to you know

into production and be able to use it so that's really fun it's like i love that part of the

experience but it's also a software developer's dream right is that you know you read all these

books on software development it's like well you need to talk to the domain expert you need to kind

of model the domain and you need to you know create the the the the programming structures

that map to the domain and you never do all this and then you do do it and it's like ah yeah this

is why i'm programming that's how it works it's beautiful uh and then then wrapping all those

trimmings around it so we use a lot of the cloud native tools in amazon for this project uh whether

we're deploying the main library.

We actually extracted the part of the notebook

that was like the important bits

into its own dedicated Python library

that we deploy into CodeArtifact

that is now used in the building

of the couple containers we use.

So we have a container that runs the Django bits,

which is the API.

And then we actually deploy a container for the Lambdas.
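
As a rough illustration of the Lambda flow described above (a notification arrives over SNS, the file is pulled from the bucket, pre-processed, and the result is put in Redis as JSON), here is a minimal sketch. It is not the project's actual code. It assumes the SNS message wraps a standard S3 event notification, and the names `process_event`, `fetch`, `preprocess`, `store`, and the `radar:` key scheme are all hypothetical. The S3 and Redis calls are passed in as plain functions so the sketch runs without AWS or a Redis server.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Yield (bucket, key) pairs from an SNS-wrapped S3 notification event."""
    for record in event.get("Records", []):
        # SNS delivers the original notification as a JSON string.
        message = json.loads(record["Sns"]["Message"])
        for s3_record in message.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys in S3 notifications are URL-encoded.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            yield bucket, key


def process_event(event, fetch, preprocess, store):
    """Fetch each new object, pre-process it, and cache the result as JSON."""
    stored_keys = []
    for bucket, key in extract_s3_objects(event):
        raw = fetch(bucket, key)           # in AWS this could wrap an S3 download
        result = preprocess(raw)           # the shared library code would go here
        cache_key = f"radar:{key}"         # hypothetical key scheme
        store(cache_key, json.dumps(result))  # in AWS this could wrap a Redis SET
        stored_keys.append(cache_key)
    return stored_keys


# Example run with in-memory stand-ins for S3 and Redis.
s3_notification = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-weather-bucket"},
                "object": {"key": "2022/KIND/scan_001"},
            }
        }
    ]
}
fake_event = {"Records": [{"Sns": {"Message": json.dumps(s3_notification)}}]}
cache = {}
stored = process_event(
    fake_event,
    fetch=lambda bucket, key: b"abc",
    preprocess=lambda raw: {"size": len(raw)},
    store=cache.__setitem__,
)
```

Passing the fetch and store steps in as functions is only a convenience for the sketch. It also makes it easy to reuse the same pre-processing code from both the Lambda and the Django side.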

A year and a half ago,

Amazon announced support for Docker images

as a runtime for Lambda,

which we have a really good blog post about up on the Six Feet Up site

because it's not straightforward.

If you've got C dependencies, this starts to get a little tricky,

so packaging it up in a way that actually works.

If you just got a simple Hello World Python,

you just use the base Python image for Lambda and you're on your way.

But as soon as you've got something where you need to do some C compiling

and, for example, using NumPy and Pandas and things like that,

the more scientific type tools, data sciency tools,

that gets tricky.

So we documented all that,

put it up on the blog post for how to do that.

So we deploy our Lambdas as Docker images.

We build those Docker images using CodeArtifact

where we deploy the wheels for our Python library.

So the same Python code is used in the Lambdas

that's used over in Django,

and it's all deployed together using CodePipeline.

Actually, we aren't using CodePipeline on this one.

We used GitLab.

Different customer had used the GitLab

this time and i actually like GitLab um it was kind of my first experience with using the CI

tools in it and i'm definitely GitLab curious now and want to try out some more but in the

past we've used CodePipeline and CodeBuild on Amazon or we've used Bitbucket Pipelines

we're currently doing a project where we're doing Azure DevOps pipelines so we've got a lot of

experience across all these different ones and i'll say the the GitLab ones are really nice um

really impressed with that one yeah that's good that's that's a nice little tip because you've

as you say you've experienced across the board i have one more question about the details of this

project then let will jump in perhaps but um how are you so you're dealing with numpy arrays and

perhaps pandas data frames um i'm wondering how do you serialize those into in the api i mean

you're using particular packages to help you get from the kind of um scientific python world into

the api world into something that's json serializable or are you right what are you doing

oh boy now you're making me dig deep into my brain there on it and what we did now i will tell you

that uh if you are going to try and use redis outside of django and basically have them both

share the same cache it is possible but you will you'll stub your toe hard uh probably the first

time you do because the settings for django's redis which is where we're basically doing these

kind of move this data i think we don't store the data frames in a in a more native way than

i think taking them to json data putting that in the redis cache and being able to serve it back

out through the the api but there is a setting you'll have to be aware of on the redis cache

side which is to allow it to be used from other tools otherwise Django's Redis cache functionality

starts kind of doing some nativey things that aren't compatible with like just getting a key

and setting it in redis okay but you come you come straight into json and then you serve that json as

it comes because that's the that's the sort of tricky question yeah for now i and i can't remember

we were also looking at my gosh i can't remember if it was MessagePack or one of the other like more

binary ways yeah MessagePack i think the need wasn't there yet and it's obviously easier to

keep it uh non-binary until we you know we don't have until the performance is a problem and the

performance there wasn't the problem the performance was really in the in the data frames and the the

pandas you know work against the numpy arrays because there's some big arrays when you think

about data weather data and i didn't know much about weather data until we started on this

project uh but there are just tremendous amounts of data that are generated every second of every

day uh for the atmosphere and all across the whole like well we're only dealing with the united

states right now but basically you can split the whole united states into this giant grid

and then you get layered data for the various you know altitudes in the atmosphere and you get all

that i mean now you got to figure out what to do with it yeah how to use it well one of the cool

things with weather i was sharing with some friends recently is there's these interactive

wind maps both for the u.s and the world where it's same thing they're getting you know wind

measurements and you can see i think they're using d3 or something like that you know javascript

front end so you can see you know more or less real time what the wind's doing um and so it's a

very sort of accessible way to see weather and realize yeah there's sensors picking up everything

But as you say, the question is, what do I do with it?

You know, how do I compute it and how do I present it?

That's kind of the hard.

Oh, and the lightning networks are even more interesting because they've got lightning sensors that can detect very accurately where lightning is striking.

And so we actually use that to feed back into the system and do almost like unit tests.

You know, did the predictions predict where lightning was going to be?
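
A minimal sketch of what that kind of check could look like, assuming both the predictions and the observed strikes have already been mapped onto the same grid of cells. The function name, the grid cell tuples, and the choice of precision and recall as scores are assumptions for illustration, not details from the project.

```python
def score_predictions(predicted_cells, observed_cells):
    """Compare predicted lightning cells against cells where strikes were observed."""
    predicted = set(predicted_cells)
    observed = set(observed_cells)
    hits = predicted & observed
    # Precision: what share of the predicted cells actually saw lightning.
    precision = len(hits) / len(predicted) if predicted else 0.0
    # Recall: what share of the observed strikes had been predicted.
    recall = len(hits) / len(observed) if observed else 0.0
    return {"hits": len(hits), "precision": precision, "recall": recall}


# Hypothetical (row, column) grid cells.
predicted = [(1, 1), (1, 2), (2, 2), (3, 3)]
observed = [(1, 2), (2, 2), (5, 5)]
scores = score_predictions(predicted, observed)
```

Run periodically against the lightning feed, a score like this gives a simple pass or fail signal on how the predictions are holding up.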

So we have a feed from the lightning

providers into the system as well, which we periodically use to test the

predictions. That's super cool. That reminds me a little bit of, I was just saying there was an app,

Dark Sky, which was built in Albany, New York, which is near where I used to live.

I think Apple bought it, but it was one of the first. They did, yeah. Which is a little sad,

but it was one of the first to make predictions. And it was, again, it was using, that's how I

first knew, oh, there's all this government, you know, taxpayer funded data. Um, and nobody had

thought to do predictions around it. I think hopefully Apple's bringing that into the default

Apple weather app. So I don't mean to necessarily slag on Apple. I'm sure it was a good exit for

them, but it was very cool. They haven't killed it yet. Yeah. I mean, hopefully they're pulling

it into the core weather thing, but I remember it was, um, cause I almost, I almost interviewed

with them. It was, I think it was like six or six or seven recent college grads who, you know,

had that insight and took the time to build it.

And there's these predictive models based on existing weather data.

That's really interesting.

And, you know, as you said, lightning strikes,

there's so many consumer business to business applications of that,

especially if it is truly accurate and you're testing it to confirm that.

That makes a lot of sense.

Yeah.

It's just interesting to have the extra validation layer that we have data

now about what we said was going to happen,

and whether it did happen or not.

Yeah.

That's not so much science.

I know, it's like, it is crazy.

Science experiments driven by Python.

And actually repeatable, unlike, you know, half of all papers in science these days.

I looked up, so when I was mentioning, so it's drug molecules where there's millions and, like, millions and millions and millions of molecules around specific compounds.

So I know specifically, like, Harvard is building its own data set.

All these different places are building.

So they're private, but they're looking for ways to, you know, license it to pharmaceutical companies or this, that, and the other thing.

And so the structure isn't that crazy.

I mean, they basically need to put a database onto the web, but, like, lock down, you know, proper restrictions.

And they have, you know, good developers in-house, but it would be better if there was like an agency.

I'm sure you could have an agency that just took scientific databases and put them on the web as, you know, B2B APIs.

Anyways, lots of science stuff happening.

Sounds like a new startup for us all to go do.

Yeah, I mean, you know, there's another one here.

It's actually a big Django shop, Path AI, that's doing pathology with AI, and they're

testing it against actual pathologists, so they're building their model, and they're

going to have a database that they sell eventually and use.

So variations of that are, yeah, especially around here in Boston.

It's not so much web-first, but it's very heavily Python is being used science-first,

which is kind of interesting.

Yeah, it's awesome.

I did want to...

I love...

Just, Carlton, go ahead.

I just wanted to ask you about, we've kind of half touched on it, but as experienced Django folks, what do we think on, what could we offer to folks who are like, I've got some science data I want to get on the web.

like what what are we missing like what's the the the if we say if we took where we are now and we

said no but now if we've got you know manage.py do some science what would that do that you know

what's the what's the the thing that we could offer the scientific community the Django extensions

yeah well what what's you know yeah yeah that's that's what i was saying like you know if they

have they just have a database built in something or other with let's say drug molecules and you

know they don't need quite drag and drop but you just need to put that on the web and lock it down

and have control access there's well and that's that's the the hard part right i mean you see a

lot of really interesting no code low code platforms out there today but when you kind of

do more than just the hello world or default kind of path in those tools i feel like they fall down

a little bit when it comes to real permissions uh you start doing a lot of like putting stuffing

code into strange places to basically simulate a good authorization system in the tool i feel

that tools like Django with the middleware and the ability to have the that control at the views

you have much easier way of expressing authorization than you do in some of these

like low-code type tools so would it be like a starter project or something like you know here's

you know here's an ap here's here's connect it to your data set be that coming from r or pandas or

whatever um here's a web page here's an api for it here's some more is that yeah that's what we've

got to offer you the DRF tutorial is pretty good i mean you you can go in and you know there's like

logged in version or anonymous public versions of the apis i think it needs to go a step further

there maybe show more granular roles than just the logged in you know mix in type type typical

views you see i have to plug my book because i just added a ton to Django for APIs so there's

a lot more in permissions oh here's our answer right yeah it's well it doesn't answer everything

but i it's about 50% longer and it goes much more into production deployments and kind of all these

things so um that's there i was just thinking well that's what's needed yeah i was just thinking

that if there's any listeners who are working, because I know like Harvard Medical School uses

Django, like who work in these capacities and want to come on the show, like email us. We'd

love to have you on and talk about this problem because it's something I'm surrounded by here in

Boston and I don't have any particular insight on what those challenges are. You know, I suspect

maybe it's almost administrative more than technical, but I don't know. Yeah. Defining

those rules as opposed to how you implement those rules. Well, I mean, because you have scientists

who are, you know, paid to create things and they create them. And then it's sort of like this

database, but there isn't necessarily someone whose main goal is how do we share this? I mean,

they want to, they're not just sharing it. It's not government, but they may not be thinking,

how do we monetize this? Um, or if they are, you know, it's like a custom thing with a lot of

times, you know, they're selling it to pharmaceutical companies who use it for their

research. Um, anyways, if a listener wants to come on, I was going to ask you though, Calvin,

about training employees because you're doing such cool stuff and you run an agency. People

come in at different levels. In-house, you must have some version of scaling people up. What does

that look like? It's tricky. With being an agency, you almost have to have more senior talent. We

really don't have a junior pipeline in place just because the way we work with companies is not the

way like a staffing company would work with companies we're not putting people into to seats

necessarily or just providing them with talent sitting in a junior role people come to us really

to solve hard problems so our our staff is really focused on heavily senior people now that's not to

say that we don't want to make sure that they're always staying up to date and not kind of just

sitting back and using you know what always worked type things we encourage our folks to go to

conferences. So I know we've got a big group, a few of us are going to PyCon. I know a lot of us

are going to go to DjangoCon this year. And that's a place where we learn and connect with the

community. A lot of the learning happens in the hallway still, where you're, you know, running up

against some folks and saying, you know, hey, I got this problem. And man, it's amazing how many

problems are solved, you know, just in a quick chat in the hallway, or even in a Slack channel

for those things. And then once a week, we do a code review with our team. And I say code review,

it's really a show and tell it's really much more I find it more fun than a code review if someone

had a code review type problem we would we would talk about code review things but most of the time

it's let me show you some cool tool I'm playing with or some new library I'm using on this project

or some new technique or some new framework or even some old thing that I've kind of brought

back from the past and reused on this interesting problem in a different way and that that helps

keep everyone up to date because I can't tell you how often I see like a wheel reinvented

only because someone wasn't aware of a technique

we've already used to solve that same problem.

That actually happened this week.

We're doing a lot of Airflow work recently.

And so there's some new people

who aren't as familiar with Airflow,

which if you're not familiar with Airflow,

it's pretty cool and it's written in Django.

And it's used for orchestrating

typically like data pipelines,

although I've seen customers of ours use it in other ways.
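
To make "orchestrating" a bit more concrete: the core idea is running tasks in an order that respects their dependencies. The sketch below shows only that idea, using Python's standard library `graphlib` (Python 3.9+). It is not Airflow's API, and the task names and the `run_pipeline` helper are made up for illustration.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies):
    """Run each task after the tasks it depends on, passing upstream results along."""
    # dependencies maps a task name to the set of task names it depends on.
    order = list(TopologicalSorter(dependencies).static_order())
    results = {}
    for name in order:
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = tasks[name](upstream)
    return order, results


# A hypothetical three-step data pipeline: extract, then transform, then load.
tasks = {
    "extract": lambda upstream: [1, 2, 3],
    "transform": lambda upstream: [x * 2 for x in upstream["extract"]],
    "load": lambda upstream: sum(upstream["transform"]),
}
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}
order, results = run_pipeline(tasks, dependencies)
```

A real orchestrator adds scheduling, retries, resource limits, and distributed workers on top of this basic dependency ordering.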

So having a wealth of knowledge now

that we can call upon to let new people join projects

who were not familiar with Airflow,

But to speed them up or ramp them up quickly is mostly through just how we can share knowledge

amongst the team.

So we have a daily stand up, you know, 15, 20 minutes where even though everyone's on

possibly different projects from different companies and different walks of life type

thing, they all talk about what maybe there is, if there's a blocker or if someone else

in the group who may not even be on their project can help them.

It really makes that transfer of knowledge a lot easier and a lot quicker.

So people aren't stuck spinning their wheels and wasting time.

So that's a lot of what we do.

Yeah.

I mean, it's a continuing problem for everyone.

Oh yeah.

I mean, I love learning.

I mean, I'm just an eternal lifelong learner and I just have a huge passion for solving

crazy fun problems.

And so the more I can spread that to my people who are on the team, the better.

So that's, is that Apache Airflow?

That's the tool you're referring to?

Yeah.

Okay. Yeah. I just didn't think, I wonder if it'd be fun to have someone from there come on,

if it's written in Django. Oh, you totally should. Uh, the, so the, a lot of the open

source contributions are coming from a company called Astronomer. Oh, that's where, um, that's,

that's where, uh, Andrew is, Andrew Godwin. That makes sense. Yeah. He's

there. Uh, we should have Andrew on. We need to have him on anyways. So we'll just, we'll just

have Andrew. It's a neat tool. I mean, really it's basically, if you took, it uses Celery,

under the covers for certain things, or it can actually swap out Celery for like Kubernetes

workers. But it's a much fancier, well-dressed version of Celery. Like if I've got tasks that

have dependencies and I need, got resource constraints, and I want to make sure certain

things run and other things don't run. And it's that whole orchestration of running either a data

pipeline, you're moving, you know, petabytes of data across the pipeline each day, or you've got

some kind of process that you need to ensure runs in a specific flow airflow is your boy well i don't

i don't know if i have any more questions i've learned a ton carlton do you have any i mean we're

close on time anything else spring to mind i'm i'm sort of uh no i'm i'm out i'm i'm you got your

homework last i'm thinking about everything that Calvin said so that that means we must be coming

up to the limit because i'm sort of like wowed by i was just thinking about that what you're talking

about the team um actually that that's what really struck me is that um the thing with

one of the big topics always interest me is how you grow how you stay in software development for

the long haul and you know to have that continually learning thing is what keeps it fresh for me and

to have a company and build around that as a as a part of the company culture i think that's

fantastic so i'm just you know just giving sending loads of hearts your way oh i mean we we i was

to say we built a place i wanted to go work every day yeah super i would love to see a DjangoCon talk

on flash if you're able to do it i mean because my you know maybe where i'm at in my career my

favorite DjangoCon talks are you know the mental health ones and then the ones that are like we

had this really hard problem and here's the tools we threw at it and it's not necessarily canonical

but you know like Django REST Knox like i remember really diving into that and learning

about that a couple years ago but i didn't have examples of it being used in the wild so i was

like i don't know like it seems really cool that makes sense but i just didn't know of

teams using it um so now i'm i'm gonna go poke again at it because i was just you know so that's

great that spurs that in me hearing that i will be giving a talk on Flash at Python Web Conference

yeah there you go me and the the lead developer from the Flash team will be co-presenting this

case study i believe i've proposed it to the DjangoCon uh call for papers if maybe it's not

even i don't think it's open it will be it will be pitched because it's a good it's a good one i

agree great okay and so to take us out just remind us when the python about the dates of the python

web conference again so march 21st to the 25th okay and uh what's the website we'll put it in

the show notes but go to pythonwebconf.com yep well thanks everything great calvin always a

pleasure thank you nice thank you so much for taking the time to come on again we'll just have

to slot you in every year you can tell us i'll make it i'll put on the calendar for next year

so we are djangochat.com, @ChatDjango on Twitter, and we'll see everyone next time bye-bye