Transcript: Dev Environments - Calvin Hendryx-Parker

Hi, welcome to another episode of Django Chat, a podcast on the Django web framework. I'm Will

Vincent, joined by Carlton Gibson. Hi, Carlton. Hello, Will. And we're very, very pleased

to welcome Calvin Hendryx-Parker back on to the show. Welcome, Calvin. Hey, Will. Hey,

Carlton. How's it going? Very well. Thank you for coming back on, Calvin. I'm excited

to be here. This has turned into an annual thing. So lots of talk about Python Web Conf,

DjangoCons, PyCons, um, maybe like what's been new the last year for you? Right, there's a lot of

things but what what's top of mind when someone asks you what have you done for the last year

professionally what are the highlights well the highlights professionally uh we've done a lot

recently with airflow so i'll be giving a talk this year at pycon on um scaling to thousands

of dags with airflow which is django under the covers which i feel is appropriate to bring up

on this podcast but it's a really really cool product for those of you who have not gotten into

orchestrating like your etl loads or i mean we use it for other things too we've actually used

it to orchestrate on manufacturing floor processes you know robotics like interesting

like kind of crossovers into real world physical things and then we do also use it for some

traditional data loading etl type activity as well can so i've used airflow a little bit the

thing i think is really exciting that i could perhaps get you to um riff about is the dags the

the the way you define the task dependencies and can you perhaps explain that for folks because i

think it's really interesting right so basically you can think of airflow as just a a really really

glorified cron or you know for those of you in the django community i mean it's like celery you

can do asynchronous tasks and have it run these various things in in some sequence that sequence

though is super customizable with python and so you can define task a b c d in some order or those

can branch now the whole thing with a dag it's a directed acyclical graph which means it can't

cycle back onto itself there's no loops in a dag technically so you're actually going to be making

decisions about what paths to follow and like there can be failure situations and recoveries

and actually run certain tasks in parallel so you may have a thing that starts but the next step

may be highly parallelizable so you can actually take it split it up into a hundred separate processes

that can all run simultaneously and then as they all finish they kind of report back in

and they can kind of come back together and run the next set of steps and things that are on there
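
A minimal sketch of the fan-out and fan-in shape described here: one task starts, several run in parallel, and the next step waits for all of them. It uses Python's standard-library graphlib (Python 3.9+) rather than Airflow's own API, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform_1": {"extract"},
    "transform_2": {"extract"},
    "transform_3": {"extract"},
    "load": {"transform_1", "transform_2", "transform_3"},
}


def run_in_waves(deps):
    """Group tasks into waves; tasks in the same wave could run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph loops back on itself
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)  # tasks "report back in", unlocking the next wave
    return waves


waves = run_in_waves(dependencies)
```

Here `waves` holds three groups in order: the single extract task, the three transforms together, then the load step. A graph with a loop is rejected up front, which is the "acyclic" part of a DAG.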

and there's a couple ways you can actually define DAGs you can use just plain straight python put

it in the DAGs folder inside of your Airflow instance and it'll pick them up it looks every

you know so many seconds to refresh that folder to see if there's more DAGs that have been dropped

in there you can also define the dynamic DAGs so DAGs that are like DAG factories so it basically

can dynamically instantiate and generate you know hundreds well I'll say hundreds at this point

because thousands really isn't realistic that's actually a limitation we ran into with the big

project we were working on and that's what I'm talking about at PyCon this year is the fact that

And as those things scale up in dynamic DAGs, you run into timing issues that the scheduler looks for new DAGs, the dynamic DAG starts building a dynamic set of DAGs, but if those two times start meeting, that's when you run into real trouble because basically the scheduler starts looking again and you're not done generating dynamic DAGs, especially if you've got an order of thousands.

And that was the case we had, which was tens of thousands of DAGs in a single Airflow instance.

And I told this to some people who were in the data world, and they were like, that can't be done in Airflow.

I go, we did it.

There's an interesting workaround.

And if you want to post a link to the blog post on our site, we actually posted how we did it.

I mean, there's no trade secrets here.

We want to help the community.

But it's pretty cool, and that's what I'll be talking about and just kind of giving some insights into.
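
A rough sketch of the workaround as it is described in this conversation: one real DAG file, many symlinks pointing back to it, and configuration chosen from the name of the file. This is not the code from the blog post. The function names, file names, and the shape of the configuration are all hypothetical, and it assumes the loader reports the symlink's path (not the resolved target) as the file name.

```python
from pathlib import Path


def write_dag_symlinks(template: Path, dags_dir: Path, names: list[str]) -> list[Path]:
    """Create one symlink per name in dags_dir, each pointing at the single template file."""
    dags_dir.mkdir(parents=True, exist_ok=True)
    links = []
    for name in names:
        link = dags_dir / f"{name}.py"
        if link.is_symlink() or link.exists():
            link.unlink()  # replace stale links so the generator can be rerun safely
        link.symlink_to(template.resolve())
        links.append(link)
    return links


def config_for(dag_file: str, configs: dict[str, dict]) -> dict:
    """Pick a configuration from the name the file was loaded under (symlink not resolved)."""
    return configs[Path(dag_file).stem]
```

Inside the shared template file, something like `config_for(__file__, configs)` would then select the data source for that particular name, so each symlink behaves like its own DAG while only one file is maintained.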

There's some cool workarounds with Python that actually helped in this case, where we actually have one,

we still have one DAG, but we aren't using the dynamic DAG factories. We actually have a process

that comes through and writes out names of files that are all symlinks back to that single file

and based on the name of the file we basically substitute in you know new configuration new

sources of data and actually can now scale this to still tens of thousands because the the reading

in of single python files on the file system is fast that that part's super super fast so it's

really not an issue it's when you get into the dynamic DAGs and trying to generate thousands of

them in memory and dynamically it slows down considerably that's really cool can you talk

about um how you like the progression to getting up to Airflow was this like project where did you

start with Celery and then you went to Airflow like how do you reach for Airflow that's always

my kind of question I guess it depends on the problem space in this case the the customer was

using another uh they were using a different orchestration stack called it was a PowerShell

Windows you know orchestration tool that was kind of a glorified cron but they had

bent it in in awkward ways that were very hard to maintain and they were also looking to basically

get in line with best practices for big data and which seems to be you know either using python or

you know there's a few other kind of common java a couple languages that are in there but they

wanted to be on python though like they think it's going to be the easiest way to onboard new data

engineers you know people coming out of school people who can jump in and actually be productive

python's an awesome choice obviously so that's that was the reason they came to us was really

to actually uh first review some of their python notebooks they'd done some you know Jupyter

notebooks that were running in Databricks as part of the ETL process and they wanted us to code

review them see if they're up for best practices and as we kind of started digging in and seeing

what was actually going on inside their whole architecture, we realized really quickly, like this can be done with some open source tools. They probably can actually minimize their reliance on some proprietary licenses. So spend more money on their own people, getting them up to speed on Python and notebooks and things like that, as opposed to spending money on licenses for big proprietary pieces of software that are actually hard to configure and use in an automated fashion. Those are a lot of point and click, drag and drop, you know, kind of analystware

type tools where they wanted to be able to check this in have it be all automated have a CI/CD

process actually do continuous integration have tests on their configuration files so that they

can detect problems before they even go into the dev environment i think something you mentioned

there was about the hiring process because if you use like you know so there's that paper about

using boring technology and you know django is always a boring technology or postgres is a boring technology.

But, you know, Django is cool. Postgres is cool. But in this sense, they're boring in that they're known commodities. And Airflow, I think, actually has made its way into that category now. It's a standard tool that's reliable, known by the community. And if you're using that tool, A, you can have less dependencies, less sort of niche bits, which are custom and bespoke and need special knowledge to run.

but you can hire someone new you can say hey we're using Django we're using Postgres or we're using

airflow i think that's right super powerful i do too and i think it's worthwhile to link to that

article that blog post i i go back and read it every couple years just to remind myself that

all the new shiny things aren't you know always there can be trouble in those waters uh it may

look like greener pastures but it's the boring tools get the job done well now i i believe in

instead of doing bleeding edge i do believe in like leading edge like i feel like we want to

stay on top of what is the best practices and django absolutely sits there in that leading

edge still of best practices is boring but it's productive and it gets the job done and it can

scale and it can do small things and it can do big things and then people have taken it to build

awesome tools like airflow i've made an analogy with surfing at times it's surfing you you don't

want to be in front of the wave because you know you're paddling that's a lot of hard work and if

you're behind the wave well you know you sort of just stop going anyway you've got to be exactly

on the wave right yeah yeah it makes sense well they i mean they also the version of that i mean

because silicon valley loves that analogy is you know you can be the best surfer in the world but

you gotta wait you gotta be on the right wave right because it's frustrating you see someone

else you're like they suck at surfing and they're just like killing it and it's like well

gotta find a way they got a malibu longboard and they're just going

what's with that guy can i can i ask you about the psychology of the the catnip the the new

and because it's part of it as a programmer it's like you always want the new tool you always want

to play with the latest this the latest what's your thought there because how do you resist that

in well and it doesn't affect everybody i mean there are probably 90 of the developers in the

world who are sitting happily doing the work they're doing day in day out and maybe not even

looking to the side like what's going on but not that's not me that's not a lot of people i hang

out with so maybe you just kind of you're always you know birds of a feather flock together and

maybe we're all just hanging out together carlton because that's what we're attracted to is like

you're a consultant right you're a consultant too i feel like consulting you're sort of

interested in a lot of variety in the new thing anyways right like it's not

right like you get airdropped into cool stuff you're not maintaining like my iteration questions

assume that you know you built it from scratch whereas you you get brought in when they're like

oh things are broken like you kind of want a consultant to be up on the shiny new thing and

to have a take yeah no that's true i mean people do rely on us to kind of vet out those shiny things

to make sure that they're not making a mistake you know down you know a couple years down the

road because everyone would love to be able to build a piece of software and just launch it and

be like okay that's cool we're done like it's awesome like but that's not true like you build

software and then you have to maintain that software you're now jumping on the treadmill

with your your product and had to you know you had to keep putting work into it to make it

still work and day in and day out now the back to the kind of shiny question i think we do a lot of

in-house like uh show and shares like you know kind of like kindergarten like bring bring your

favorite toy to school and show it off so a lot of every developer kind of gets an opportunity now to

show off or look at some cool stuff and then everyone gets to see it and talk about it and

discuss like what their thoughts are on it so we've gone back and forth a lot on especially

javascript technologies because we're historically been more of a python shop and that's you know

things we've always done have been python but there's no ignoring the importance of javascript

and now typescript in our our day-to-day work so understanding what's our approach and like what

do we care about in these various technologies how do we want to actually like you know code

code for them put in linters like you know like where are guardrails for doing a new a new piece

of technology those typically come out in those kind of tech show and shares yeah and i think in

javascript managing the the tech stack is all the more difficult all the more like pressing it is

tricky i would i would definitely agree with that i think that's i mean that's great to have to have

that internally because that is i think in any field you know if you're like a musician who

plays you know rock and roll you probably want to do jazz right like if you're doing some big

project in programming it's nice to just spin up a greenfield thing like you need to keep that

beginner's mindset and that playfulness which you can't always maintain if you're only working all

the time and something does i mean i remember when i was you know in a previous life i was a

book editor so all i did was read but reading with like an eye to something as opposed to

reading for fun you're like why the hell would i want to read for fun when all i do is read but

like you lose something so like building that into your company i think is you know really important

because otherwise you just get stale and yeah well there's a certain passion people have to

have for the craft that they're doing actually that's one of the things we look for in a

developer who would join our team is going to be a profile we call a craftsman they've actually

renamed it since then but we use a couple assessments and tools to look at the whole

person that we're bringing in to hire or interview and one of those criteria is there's some

indicators and markers from some specific assessments they're not the end-all be-all

whether we'll hire a person or even interview a person or not they're just one more piece of data

point that we can actually look at and knowing that they're in that craftsman you know arena

means a lot because they've got just kind of certain motivators in their life about i'm excited

about technology or i'm excited about the craft of it or i want to build you know as opposed to

other people who have different skills and different like passions and they suit better

into different spots can i just ask one more question while you're talking about hiring then

and teams and craftsmanship um do you do you have like um a kind of standardized tool chains and

standardized processes so that people can switch between projects and that you know the tooling's

the same or we are absolutely working on that yes uh so we there's some of it depends on the client

because we are consultants and so some clients have maybe they're on jira and we want to use

you know Bitbucket or YouTrack or GitLab or some other tool so and one of the things we're working

on right now is really the developer experience at Six Feet Up and taking that developer experience

and then applying it to the client in a way that they see a lot of value in it that they may adopt

that best practice we think we feel like is a best practice in software development life cycle

so absolutely being able to use well everyone kind of picks their own ide but things like

pre-commit hooks you know having those set up so that everyone's running black everyone's running

isort everyone's running you know prettify or prettier on javascript so that those common

guardrails happen before they even hit the ci pipeline that helps like get everyone in the same

you know rowing in the same direction when it comes to their development tools and stacks they

can use whatever IDE they want but the code they're going to submit is going to kind of comply to some

internal standards and then using tools like we're really rapidly adopting kubernetes and we'd

adopted you know containers because that actually leveled the playing field for folks to be on a

linux mac windows and actually be just as productive in any of those platforms now the

next step is actually adopting kubernetes so that the the process of deploying and developing almost

looks identical there's just you know different versions of the container you may be using for

development that has the dev tools installed or the pydevd extension so you can do remote debugging

but when you're releasing that container into production,

that whole process, the whole stack,

doesn't look unfamiliar,

because it may look unfamiliar now

if you're developing locally on Docker Compose,

but releasing into Fargate or some kind of Kubernetes,

The developers are like, I know nothing about this deployment process.

There's zero attachment other than the fact there's a container that runs my code.

Sounds nice, Carlton, right? Simple deployment.

It does sound lovely. It sounds lovely.

I think the key bit there is keeping the dev and the deploy looking similar, looking the same.

We've always strived for our QA or staging environments to be identical or as close as possible to production.

But bringing that a step back further into the development environment

without it being so onerous to run,

I think a lot of the issues before is if you wanted to run a full stack of stuff locally

and you were doing a Django app,

but you wanted to be able to have the Nginx and load balancing

and the Redis cluster and the Postgres cluster

and all these other kinds of pieces running,

well, you either had to install them from like Homebrew onto your Mac

and then maybe you had a slightly different version that was within QA

or there's always these little rooms for strange edge cases to sneak in because of that.

But if you start using these Kubernetes manifests, you're using the same manifest, but just defining different environments.

So there's the same versions of all the containers being run, same versions of the database, same versions of Redis, same versions of everything.

Now, it seems to be a lot more reliable and repeatable to do those deployments.
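
The idea of one manifest with different environments layered on top can be sketched in plain Python. This is only an illustration of the pattern, not Kubernetes manifest syntax, and the image names, version tags, and settings are all hypothetical.

```python
import copy

# Hypothetical base definition: image versions are pinned once, for every environment.
BASE = {
    "images": {"web": "myapp:1.4.2", "db": "postgres:15.3", "cache": "redis:7.0"},
    "replicas": 1,
    "debugger": False,
}

# Only environment-specific knobs live here; versions are deliberately absent,
# so dev, staging, and production cannot drift apart on them.
OVERLAYS = {
    "dev": {"debugger": True},
    "staging": {"replicas": 2},
    "production": {"replicas": 4},
}


def render(env: str) -> dict:
    """Return the base definition with one environment's overrides applied."""
    manifest = copy.deepcopy(BASE)  # never mutate the shared base
    manifest.update(OVERLAYS[env])
    return manifest
```

Calling `render("dev")` and `render("production")` gives the same container versions in both, with only the debugger flag and replica count differing, which is the property being described.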

Well, I want to ask you about your bootstrapping your local Python environment talk.

Actually, because Carlton and I were talking about this before you came on.

But it is true that like of my three books, the third one just uses Docker because I'm like, OK, like, you know, containers, you're all set.

But that's not that doesn't work for beginners or all edge cases, even though maybe in a more professional setting, it's just like just containerize it.

And that's what I get to at the end of my talk.

So go ahead, Carlton.

Well, no, no, no, because it's it's not necessarily for me.

It's not necessarily about professional versus amateur.

it's about sort of the kind of complexity of the team and the complexity of the stack so

fair enough yeah you know if it's like a small startup you know five five five people in a team

you may not have the ops chops to take on something like kubernetes whereas if you've

got a 20 person team you want to have that ops person there who's making sure that those 20

people are all doing the same thing and it depends if you're just running just running python and

you know you've got a hosted um hosted postgres and you know maybe you're running redis that extra

ops complexity doesn't necessarily pay for itself at the small scale but if you've got a bigger team

then sure it i think it does i don't think it precludes you from using containers in either

case even for small small scripts we still use docker locally to run a single python script

because you can guarantee the version of python that's going to run on a guaranteed like user

space inside that container it just narrow it keeps everything so much more sane so i even even

now and this is one of the things i got again i got to at the end of my talk at pycon last year

was that docker was really the way forward i think for especially developers who are across

a heterogeneous environment of os's versions of os's versions of python how they got installed

where they can you know you don't know where that python came from when you type python the command

line i can know i know how to guarantee i know where that python comes from when you write you

know docker run my my image so so what's your answer to how do you bootstrap your python

environment sorry i want i want i want i want him to do the talk i just i just want to make okay

well just i've because i've i've had i've had this i've had this conversation a lot with readers

you know and so one of the one of the major things i hear from from people around docker is

you know you need a nice laptop to run it a lot of you know you need a lot of ram or more you know

so it is a barrier someone you know who doesn't have a lot of money who doesn't have a nice you

know fast laptop docker sometimes is a non-starter so um well that's that's one of the reasons we

moved to kubernetes uh which sounds odd to say because of yeah explain that the reason we've

started using tools like Kubernetes is actually if you're running Docker and Docker Compose you

can run a Docker instance remotely on some other machine and actually like you know from your local

machine control that through the Docker you know control plane or even compose now the issue as a

developer is you want to be able to bind mount your source code into that that container when

it's running so you can debug or change the code and have it auto you know hot reload like Django

or the react code have it hot reload for you that's that's almost impossible to do on a remote

machine if you're just using straight up docker and docker compose because of the whole bind mount

limitation now tools like Skaffold and kubernetes change that model i can actually now actually

Skaffold's faster at syncing files into a running container than docker is at bind mounting

because of the io complexities that docker brings to the table now they fix some of these performance

issues in docker it used to be terrible uh i don't know if you've ever you know developed kind of a

few years back on docker where you're bind mounting in there and changing or had a large project that

was bind mounted into a container but the io was terribly slow but you get around that if you start

using alternate tools like for example i mentioned Skaffold which can synchronize it watches for file

system changes synchronizes just those changes into the container for you and then you get the

hot reload capability and now you have the ability to actually dev locally work on local files using

your ide and all your your standard developer tools but have those synchronized into anywhere

you could have a sidecar like small little you know intel tower sitting you know desk side or

or it could be a cheap you know droplet in DigitalOcean someplace that is actually running you know

MicroK8s or minikube or you know you've got some lightweight single node kubernetes option

for development and now you can spin up for the couple hours you may need a day to work on that

specific project for pennies uh like all the ram you want yeah i mean related there i've recently

become quite a fan of the Codespaces integration into GitHub because i've just taken to that

pressing um you know in a repo just pressing the full stop and it convert and it just flips over

to this github.dev and a kind of vs codey type thing loads up and then you think oh actually i

need i need a an actual um i need an actual computer here so you you just go oh load this

into a code space and then it refreshes and then you're like oh do you know what in the browser

it's a bit of a pain i'm going to switch to vs code on the desktop it's like that's quite nice

um and i've been i've been a skeptic for a long time because i i um i really love local development

for all the convenience and it makes developing on an ipad a reality like the ipad pro is now

with the new especially with the new uh Stage Manager uh windowing the the difference

between a macbook pro and an ipad pro is becoming very very slim as far as the user experience and

usability sorry if you can hear my dog in the background there yeah we're sorry we like the

dog bring the dog on more often yeah he's quite cute but i i love that portability i want to

travel with just my ipad but i still want to be able to jump in and fix a problem if i need to

although my my leadership team would tell me i should stop that and actually rely just on my

team instead of me doing those kinds of things but i still love to tinker i still want to develop

and be able to do it from an ipad wherever you know hooking up a keyboard and i've got a mouse

and now i'm in Codespaces or some remote uh compute environment yeah that's way faster

well actually the ipad is dang fast but it's so much nicer to have a cluster of machines in a

cloud someplace that i can resume into so if i was at a coffee shop pick up go back to the room or

whatever and then set back down again like nothing nothing happened like i've just i'm back in the

flow again and it gets over that limitation of the you know the the ipad os where you can't run

you can't you can't really run a python interpreter or you can't run a grass compiler or you can't run

a you know anything where no it's all on some box somewhere in the you know ether sphere yeah

i think for developer experience that's i think it's a big advancement for let's kind of roll

back to those developers who are not the bright you know shiny you know lovers they want to just

come in do their work go home i think for those developers this is a great innovation for them is

that the people who do like the bright and shiny can set up the most amazing development environment

and now share it like a rubber stamp to everybody in the whole team and now they can all share in

those same productivity tools that the you know the one person is super passionate about getting

all the power tools in their environment their dot files are thousands of lines long because they

you know just love tricking out every aspect of it that might be me uh but i can now set this up

for everybody and be like check this out like here's during a show and share show them all the

cool new few superpowers that are available to them on their command line or in their ide or in

their editor and it just goes a long way because some of them would be like oh that's cool i would

use it if it was easy to set up but those things aren't always easy to set up but they can be

shared with others easily yeah so that dev container approach that's yeah well speaking

i wanted maybe we've already covered some of it but your pycon talk about bootstrapping i noticed

your command line you've got that nice little timer thing for commands like a little hourglass

I was, cause I mean, I'm not fully tricked out, but I, you know, it's fun to do, but

I was like, Ooh, I hadn't seen that before when you were doing some commands locally.

So, yeah.

So I used, I can't remember what phase of my computer that, so I'm again, I'm a tinker

hacker.

I love fiddling my, my laptop all the time for a while.

I was using the bullet train go command line, but that got, it went, uh, kind of unmaintained

and I finally graduated into Powerlevel10k, or I think it's what the current one

i'm on i've been really happy with it it's but it's infinitely configurable you can spend a

couple days you know running through dot files if folks are you know curious my dot files are on my

github so you can actually just go to github.com/calvinhp/dotfiles and see like the

my configuration i use for the the Powerlevel10k stuff and it's it's it's so nice i mean it's nice

to have all that extra data right there available to you especially the time which is more valuable

than you think because as you are running commands and doing things throughout the day

you pick oh shoot how long did that thing take to run or you know what time was i working on x

because maybe i needed to jot it down into my like time sheet or whatever i can definitely scroll

back through my buffer my terminal now and see oh that was at like 10 15 and you know i took

10 minutes to run and just provides all that additional context especially the virtual

environments bit too like knowing what the system python might be or what the virtual environment

you have activated actually is can save you a lot of mistakes i don't know how people will go with

a prompt that just has the dollar sign like mind-boggling like well that's yeah i mean you

need at a minimum you need to have the Git stuff to let you know if something's changed

well you say minute you say minimum but how many people have you ever seen like open up a terminal

and you're like oh boy we got some work to do here like they're just they're riding pretty stock

however i try really hard not to immediately judge them right i'm feeling judged well because

no because because no sorry no but it's you know it is it's like it's like site design right it's

like simon willison's site right it either like their terminal looks like that it's like they're

either simon or they're not you know that's immediately i'm like it's an extreme here

that's what i think all right wait wait anyway so i mean so your your talk though local python

is a challenge um i thought actually the best job i've seen of laying it out i did want to

ask you though you i think you punted a little bit on like how do you recommend people install

python because you mentioned there's like i 100% recommend pyenv for installing a version of python

any other python can be is untrusted and possibly incomplete uh that's been a huge pet peeve of mine

the fact that you can install Ubuntu or be using an Ubuntu container and you won't have necessarily

pip available or you won't have some of the standard library modules will not be there

because of some opinions that the OS makes. And most likely they made those opinions to protect

the OS. They don't want you randomly pip installing something into the root global

environment, which is a good thing. But it means that you really need to be installing pyenv

so you can control which specific version of python you want and i recommend installing pyenv

with the virtualenv extension so you can just from right right from one command line create the

virtualenvs activate them and now when you go into those directories or out of those directories

it automatically activates and deactivates those virtualenvs so i'm never typing activate i'm never

typing workon i'm always just relying on you know the the .python-version file that's in my

you know path that just picks up what i should be doing when i wherever i'm at because i'm moving

from project to project a lot that may not be common for most people but it's actually more

common than you think because you're probably developing side tools that you're using as part

of your course of your work and those should be in separate projects from like the main project

you're working on and they maybe have different requirements than the main project you're working

on you may have a different version requirements that you need to pin specifically for that side

tool as opposed to the more stable project that might be running on an lts version of a library

your project i like to in your talk you mentioned or you recommended um pipx for you know global

global things like black and stuff that you want yeah for folks who don't know pipx kind of takes

a lot of that headache out of the way and also i think you get more recent versions of some of

those pieces of software so you could brew install black or isort i think those packages may be

available in homebrew but they're not going to be as up to date as if you pip it pip installed them

from pypi because those may be the latest versions maybe the homebrew maintainer hasn't gotten to

updating the the entry in the homebrew you know package catalog but pipx gives you the best of

all those worlds where i can now pipx install black it installs and it creates a virtual

environment behind the scenes in a dot directory that you don't see but then it also puts the tool

into your path correctly so you don't think about it you just type black and it's in your path it's

going to work it's going to you know always be pinned to the right versions that are used for

that version of black and you can install you can inject other dependencies in there for example

if you're using like the markdown tools and you want to install pigments into that virtual

environment it's it's very possible it's absolutely possible to now inject

the pigments library to that version of the markdown virtual environment and use it with

your python markdown command but have a different version of pigments you may use standalone from

the command line because they're in two separate environments so the case i have there is with

sphinx because you always need extra you know plugins and things with sphinx so that so there's

one environment that's got all the sphinx bits in and they can just be updated yeah yeah yeah and

I've been using chezmoi for my dotfile configuration management. That tool allows me to have a chezmoi init bootstrap command. Basically, a text file defines that, on the very first run of this dotfile manager, it should run these commands. So it goes and installs Homebrew, installs pipx, and installs all my standard utilities, so that every command I expect to use will be available. A big complaint a lot of people have had about customizing your dotfiles and your shell is that you log into some random shell to debug something and half your stuff doesn't work, so your muscle memory is all broken and your keyboard memory is missing. But for me, I log in and I literally do chezmoi init calvinhp. Because it comes from my GitHub, it knows to look in GitHub to get the dotfiles and to run that first bootstrap command. Maybe it takes 30 seconds to a minute, but everything now

just works.

Yeah, not having a tool when you log into a strange box is kind of why I keep my setup quite vanilla.

Is that it, Carlton? There's no other reason?

There is: I'm old.

Speaking of being old, Carlton, can we tease your PyCon Italia keynote? Because it's out there.

Yeah, yeah, you can. So I was invited to keynote at PyCon Italia, and I'm doing a keynote called "Open Source for the Long Haul", which is about what it says. The relevance of being old is that it's kind of the counterpart to the "Growing Old Gracefully" talk I did five years ago in Heidelberg, when I started fellowing. Now I'm finishing fellowing, and I'm doing this kind of talk. It's not explicitly a reflection on that, but as I'm writing it, it's a reflection on: okay, that's what I said five years ago, what would I say now? So I'm quite excited about that.

I have to go watch that. I'm quite interested. I've followed your journey from the beginning as a fellow, so I'm curious to see what the wrap-up is, the post-mortem.

You're not dead, luckily.

Yeah, no, no. But part of it is you've got to be able to step away before you become, you know...

So, segue to conferences: Python Web Conf is coming up.

Yeah, in a couple of weeks. March, right?

Yeah, March 17th through the 23rd, I think.

Yeah, we'll have a link to that.

Right, I'll be in trouble, but...

Okay, we'll put a link. But this is, you know, speaking of watching things grow over time, I mean, this is really up and to the right.

Fifth year.

Fifth year, but I mean, the speakers and the number of people and everything is really impressive. I saw actually you did some videos. You did one with Al Sweigart, but like, you know, even teasing out, you know, guests now. So it's really impressive to see.

There's more of those coming. I think tomorrow I'm actually interviewing Anthony Shaw, who will be one of our speakers from Microsoft. We've got two speakers from Australia who are joining us. I mean, that's the dedication for the excitement level that the speakers have about this conference is that they're willing to come at like three in the morning their time and give their talk live to the crowd. But it has grown a lot over the past five years.

We started this in 2019 as a virtual conference before it was hip to be virtual, and it will always be a virtual conference.

And one of the things I always love to tell people about this conference is it is really meant to be accessible to those who may not be able to make it to a PyCon or a regional conference of some kind.

We want to make sure that fills that gap.

And the other part is I want to make sure it fills in where I feel like some other conferences may be lighter on the intermediate content.

I think there's definitely a space for like the PyCons and the regional conferences to handle a lot of those newer intro talks to get people spun up into the Python community in a nice, gentle way.

We're really meant to bring some heavy hitting talks, not that I would dissuade you from coming if you don't know anything about Python, but you're going to be in a more advanced group of speakers and guests and people who are attending the conference.

It's really a lot of very professional folks who are doing a lot of cool things with Python.

I had a kind of intro talk with or a conversation with one of our speakers yesterday.

They're talking about the state of fusion.

So those of you who are kind of following the fusion story right now that we finally had in the last couple of weeks,

the first time that more energy was produced by fusion than to make the fusion process happen,

we have a speaker who's actually going to talk about where Python's playing a role in that science.

So I'm very excited. They're super passionate about this topic, obviously. I know very little about fusion, so I'm looking forward to hearing about this.

This wasn't the speed-ups in Python 3.11?

No, no, no. I don't think they use it at that level, where the performance actually matters. They've got other things they're doing. But when it comes to data and accessibility, and getting people onboarded to be able to process all the data and actually come up with new insights into what's going on in the science, Python just provides people with so many superpowers. I mean, the batteries included, and then you layer on top of that pandas and matplotlib and all these great libraries that give people superpowers without having to do much work. Having all those things at your fingertips is really a powerful story for Python.

So that's what you'll come to expect to see at Python WebConf this year. We're going to be five days, half days, East Coast time. We will have a keynote speaker kick off every day and a keynote speaker to end every day. I really love hearing inspiring speakers, so even though we are a virtual conference, we are having 10 keynote speakers as part of the standard program. That will never change, because I love having big thinkers get up on our stage and talk about the things that are going on in their lives. Those talks range from technical to, well, I've had some speakers who had barely ever used Zoom before, who are non-technical, give some amazingly moving talks.

So if you look, last year's talks are all online. There's a talk up there by Noel Musica talking about the flower. It'll bring you to tears, but it'll move you to action on whatever that passion may be inside of you. And that's what I want to draw to people. Mine just happens to be Python and the community, and that's just where I get a lot of energy. So I want everyone to join me for Python Web Conf.

Super.

I know Carlton, you know,

with all your post-fellow time.

Yeah.

2024, Carlton will be everywhere.

Oh, I don't know.

We were just talking about this.

Yeah.

Actually, you might appreciate this, Calvin.

I was sharing a quote about, you know, marketing, right?

Tech people famously don't like to do marketing.

And it was saying that, you know, it seems self-involved to market, but it's actually more self-involved to think just building it, people will come to it.

Because why are you so important that people care?

It's so true, though.

It's so many times people think that, you know, if you'll build it, they'll come.

How are they going to know about it?

Who are you that's going to bring them to it?

Right.

I mean, I, and I try to think about this like, like movies, right?

Like Tom Cruise, right?

He spends, I don't know, six months making a movie.

He'll spend a year marketing it like Tom Cruise, right?

Like it doesn't matter how big you are because they know what's up, right?

Like if you, but you know, the hard thing is you think all the time is spent on the

making and really it's that long tail of, of, uh, of marketing.

So anyways, I thought that was good. It's like, yeah, because it does feel a little icky to self-promote, but it's like, but if you don't self-promote, like, how does anyone know anything about you?

So yeah, that's definitely been a problem in the open source community. I think there's been a lot of resistance to people making money. They feel like they should all be doing it just for the good of the projects. But these projects, like Django, wouldn't be around if there wasn't money to fund the initial seed of the project, or the fellows to make sure they can steward the project into better directions, or the companies who are submitting back the pull requests for bug fixes and security fixes because they needed them. There's still money involved, and it's not a bad thing. We shouldn't be upset about talking about that, and we shouldn't be upset about marketing yourselves as awesome developers out there.

But, like, Eric Matthes is putting out some more content, and he said on Mastodon, or Fosstodon, whatever we call it, "I'm a bit shy about saying it." It's like, no, Eric, we want to know. Really, this is why we're here. He does have a Substack now. I think it's mostly Python.

Right. But even the top Python book author feels that way.

So there's a lot of good people to emulate or model out there. I mean, Simon is excellent at sharing every idea and thought he's having in his head, and I love it. I want to hear more about the things he's doing, and he shares all the time.

His talk that he gave at DjangoCon this year, it was only one small part of it, but part of it was that every commit needs to have the documentation with it, and every project needs to have the blog post that goes with it. The README, the front page, the whole entry into the project is really important, and if you're not doing that bit, it's not done, was kind of his line. I did really like that. It's like, yeah, okay, so I've got to write the docs here and I've got to actually tell people about it.

Well, Jeff Bezos at Amazon is famous for having his executives write the press release first, and then figure out how to do a new project. So just that idea of working backwards, as opposed to "let's just jump in and figure it out," which of course is what we all kind of do. But it is helpful to ask: how are you going to describe it to someone? It's like, "well, it's complicated," and they're just zoning out immediately. I think it is good guardrails for yourself. As we see, you know, squirrel, squirrel, squirrel, all the interesting things we want to chase, what are the guardrails that we impose on ourselves for a project?

Anyways, it seems like there's been multiple attempts at document-driven development, but they just never really caught on. That's a hard leap to make, to go directly from "I'm going to write the docs first," and then end up in tests, and then end up in the actual code that does it.

Yeah, doctest-driven development.

Yeah, that's pretty black-belt level to be able to do that. I've never done it.
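As a rough illustration of the docs-first idea mentioned here, Python's standard `doctest` module lets the examples in a docstring act as tests. In the sketch below the docstring examples are imagined as having been written first, with the one-line body filled in afterwards until they pass. The `slugify` function and the `run_examples` helper are hypothetical names invented for this example.

```python
import doctest


def slugify(title: str) -> str:
    """Turn a title into a URL slug.

    These examples are the documentation, and they double as tests:

    >>> slugify("Hello World")
    'hello-world'
    >>> slugify("  Django   Chat  ")
    'django-chat'
    """
    return "-".join(title.lower().split())


def run_examples() -> doctest.TestResults:
    """Run the docstring examples for slugify and return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        slugify.__doc__, {"slugify": slugify}, "slugify", None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    return runner.summarize(verbose=False)
```

In a real module the usual shortcut is `python -m doctest yourfile.py`; the explicit runner is used here only so the result can be inspected from code.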

I mean, I occasionally will write the test first, but normally it's when I've got a complicated algorithm or a complicated process, and if I just wrote that little bit, that would be one step in it. Okay, so I'll write a test for that one little bit, then I can write the bit, then I'll write a test for the next bit. In that circumstance I'm doing test-driven development. But quite often it's the other way around: blitz out some code, and now I'm going to write some tests. And it's the same with the docs. Normally the docs are: right, now I'd better write some docs.

Yeah, yeah. Well, Calvin, I wanted to ask you. Carlton and I were also talking about testing, in that a reader asked me a really interesting question, which was: is there a canonical, encyclopedic guide to Django tests? Which I don't think there is. I wonder, internally, do you have something like that? Because, okay, you test your models, views, templates, URLs, but for a beginner it's hard to know what's core Django, what's new, where that line is where things become custom. I have some testing tutorials, but I don't have a great pie chart of "here's where you need to make sure you have coverage," other than running coverage. So I'm just curious, internally, how do you think about that? Because I would love it if that existed and we could all agree: here's the 80/20. Other than, when there's a new page, you should test all the pieces of it.

Yeah, right now that's really driven internally by the developers. Their love of testing is going to determine how good the output may be. Luckily, on our side, we've got some great new developers who've joined the team recently who are very excited by test-driven development, and so we do see more test coverage in our projects. Prior to that, it was all new, or people weren't excited about it. But I think that has to be a culture. You have to start bringing in folks who are excited. They'll give the demos, and you'll see the test coverage creep in or start happening. And with CI pipelines, when you have a failure and you get a nice little email saying you broke the build, that obviously is a nice peer-pressure-driven development that encourages you to go fix it or add test coverage, because maybe you've got a rule that the coverage has to be X or your commit can't go through.

Do you have the rule that whoever broke the build last is responsible for maintaining it?

We do. But I keep seeing broken builds sometimes stay for a couple of days. I'm like, come on, what's going on here? Let's get back in line. I have to go have a conversation with a couple of people.

I think if I was targeting beginners, I'd say: have a test at least on the main functionality of every bit. Check that your home page loads, or check that the login button actually submits, or something. Then if you run that with CI and you break it, you know that you broke it. 100% coverage may be a bit too much to aim for, but if you've got that kind of basic coverage over the core bits, it's like a smoke test: oh, actually, I broke something.

I think there's value in writing those tests that seem silly. You're like, oh, this is obvious, this should always obviously work. Well, if it should always obviously work, you might as well write the test for it. But then when it obviously doesn't work, you'll know.
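A "does the home page load" smoke test of the kind suggested here can be very small. The sketch below uses only the standard library so it stays self-contained: `app` is a hypothetical stand-in WSGI application and `get` is a hypothetical helper that calls it directly. In an actual Django project the same check would normally be written with Django's own test client instead.

```python
import unittest
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Hypothetical stand-in site: a home page, and a 404 for anything else."""
    if environ["PATH_INFO"] == "/":
        start_response("200 OK", [("Content-Type", "text/html")])
        return [b"<h1>Home</h1>"]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]


def get(path):
    """Call the WSGI app directly and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


class SmokeTests(unittest.TestCase):
    def test_home_page_loads(self):
        status, body = get("/")
        self.assertEqual(status, "200 OK")
        self.assertIn(b"Home", body)

    def test_unknown_page_is_404(self):
        status, _ = get("/nope")
        self.assertTrue(status.startswith("404"))
```

Tests this obvious cost almost nothing to keep, and run in CI they are the ones that report a broken home page before a user does.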

Yeah, yeah, yeah.

Yeah.

It's hard when someone, you know, again, this is what I spend my time with, someone who's so new. Like someone who was saying, okay... because I have tests in all my books, but I can't write an entire book on Python while I'm writing a book on Django. And they were like, okay, we're using setUpTestData. What's this class method? And it's like, okay, well, what are classes? And then, why am I using cls? And it's like, oh, PEP 8. So it's this tension, and I enjoy this challenge of trying to be like: I want to explain things, I don't want to have magic everywhere, but how deep do you want to go? Sometimes it's just, trust me on this, this is a best practice, and here are some links if you want to knock yourself out. But it is completely overwhelming, because you're just taking it on faith. It's like, oh, I copied these tests and it seems to work, and I kind of understand why. You should just put a little footnote: go see Brian Okken's book if you want to dive in here, you know?
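For readers stuck on the same question, here is a small sketch of what a class method and `cls` are, using only the standard library. `Book` is a hypothetical stand-in for a model, and `unittest`'s `setUpClass` is used as an analogue for Django's `setUpTestData`, which is likewise declared with `@classmethod`. The name `cls` is a convention recommended by PEP 8, not a keyword.

```python
import unittest


class Book:
    """Hypothetical stand-in for a model class."""

    def __init__(self, title, price):
        self.title = title
        self.price = price

    @classmethod
    def free(cls, title):
        # `cls` is the class itself (Book, or a subclass), not an instance,
        # so this acts as an alternative constructor.
        return cls(title, 0)


class BookTests(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once for the whole test class. Here `cls` is BookTests,
        # so the data is stored on the class and shared by every test.
        cls.book = Book.free("Sample")

    def test_title(self):
        # Instances can read class attributes through `self`.
        self.assertEqual(self.book.title, "Sample")

    def test_price(self):
        self.assertEqual(self.book.price, 0)
```

The practical point is that shared test data is built once per class rather than once per test method.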

There is a tension there with the kind of person who wants to understand everything, but it's also just turtles all the way down. So I see that some of these people get really hung up, and sometimes it leads to "I'm just never going to grasp all this." It is fractal. You do need to put limits on stuff and just have a list of "I'm going to go learn those things." But at the same time, someone could read the Python docs and read about class methods, but there's no way they're going to remember that until they need to use it. So you kind of need to use it to learn it. Anyways, I have empathy for these things. Even for me, I'm like, oh yeah, I know we use that, but why do we use that? Like PEP 8. I was like, why the heck do we do it this way? It's like, oh yeah, it's a PEP somewhere.

We have that same tension even internally on our team. We're a team of senior developers who are continually wanting to learn new things, but there's no point in going and learning some new piece of technology if you're not going to go apply it right away, because you'll just forget it. Six months later, when that project does come around where that technology is actually warranted, you're going to have to go brush up on it again anyway. So a lot of our learning is really on the job, hands-on, when a project comes about. We track a kind of skills matrix or inventory of, obviously, what people are good at, but also what they want to learn, so that we can match them up with projects where they would want to expand on the new skill set.

That's really wise, because I've seen so many times a developer in a job who's a little bit bored start experimenting with some new tech because they're a little bit bored. The way around that, of course, is to preempt it: know that they want to use that, and assign them to the task. So I think that's obviously a battle-won learning.

Yeah, it's a free tip for the day.

Yeah. Well, I think also bugs. This is maybe less as a consultant and more when you're working internally on something. You just have this big long list of known bugs, and you're figuring out how you deal with that. You can't just fix bugs all day and have perfect code, because you're never going to ship new features. But on the other hand, if you never fix any bugs, developers are all going to eventually get frustrated and leave. I've said this before, but I think one really nice way is to have bug-squashing days, every month or whatever, so that it puts the onus on the developers. It's like, okay, there are 100 bugs, you take ownership for deciding which 10 you're going to handle. It's like an escape valve: yes, we are fixing these things, but not right now, because we're doing something else. But it's not an indefinite "the code is all crap."

If you look on our site, over the years, we've done an activity internally called FedEx Day.

It's now called Ship a Day.

And actually, it's kind of more recently morphed into what we're calling Tech Debt Relief Week, where at least like the first week of January, everyone's kind of coming back from the new year and they've been off for like a week or two.

And we'll be like, we're not going to do any customer work this week.

We're just going to focus on tech debt relief or new process enhancement.

And so it just kind of came about that that was a great way to give people that outlet to clean up the little messes that have been made.

And sometimes we do it more than once a year, but usually once a year, we spend a few days not on billable work, but actually fixing up and scratching those itches and coming up with new innovations and ways we can improve our workflow.

It feels so good as a developer.

I mean, it's hard to describe to someone who doesn't do this kind of thing, but it's like a cleanliness thing.

Right? Like, oh, there's nothing worse, for me, than picking up an old code base where we thought we were doing best practices at the time, maybe a couple of years back, and looking at it and going, oh my gosh, I've got so much to clean up here first before I can even be productive. It's like having a clean desk before getting started on some creative work. It's hard when you just see little issues here and there and there. It distracts you from solving the actual problem. At least that's for me. I assume most folks are like that.

One analogy I've seen for this kind of discussion about tech debt versus shipping features is that, okay, if you just work on shipping features, in the short run you'll ship more features, but velocity falls over time, and in the medium to long term you're actually going slower because you didn't prioritize. Would you agree with that?

I would agree with that. It's like the drag, the friction coefficient. Well, obviously in the beginning of a new product there's a lot more new features to build too, because nothing exists. We're experiencing that with the LoudSwarm platform that we host Python Web Conference on. We actually pick it up a couple of times a year. If folks are not busy on another project, they will get assigned to do some cool new stuff on LoudSwarm, which is actually really fun. That's been a great platform for us to experiment with new technologies. A lot of the WebSockets work we've been doing and a lot of the Kubernetes advancements in our developer experience have come directly out of us innovating on the LoudSwarm platform.

That's cool.

I love that, because you've hosted the DjangoCons for the whole of the COVID period and since as well. But I remember last time you came on, you were talking about Fargate and containers on Fargate, and now you're talking about Kubernetes.

We still use it. You can still use Fargate with Kubernetes. If you're using EKS on Amazon and you don't know what your capacity is going to be, or you want to scale closer to zero, it's possible to spin up compute in Fargate. Fargate works on ECS and EKS equally well, so you can define your pods and your resources to pull from Fargate's ad hoc serverless resources. Sorry, a little plug there for Amazon Fargate, but I do find Fargate to be a great option when you just aren't sure how many resources you need, or you don't want to be maintaining the EC2 instances that would be hosting those resources.

Yeah, no, perfect. So many pieces, so many moving parts.

To have a playground to work on it helps a lot, and it brings people up to speed on things like CI/CD pipelines and Terraform and configuring CloudFront and the whole caching pipeline, and how you split off WebSockets into separate pods of resources that are servicing just those kinds of requests. There was a point that came up a couple of weeks ago where people were asking how the front end got deployed. The CI pipeline would run and it would deploy new Fargate containers for the back end, and there would magically be a new S3 bucket synchronized with the latest front-end changes, and people were like, we don't know how those get there. So we had to go track it down, and it's actually kind of clever. We had Fargate tasks, so that when the Fargate release would happen, obviously new task definitions would be generated, Fargate picks those up and then replaces the old containers for Django. But at the same time, it runs the Django migrations, it does the collectstatic, and then, as a task that runs and stops, it synchronizes all of the front end to the front-end S3 bucket. So it means that with any release, you're sure the front end always matches the back end, because they were tagged at the same time.

Yeah, lovely. That's nice.

Coming up on time.

Our previous guest was Mariusz Felisiak, and we were talking about Django 4.2 and 5.0. So my question to you is: are there new features you're looking forward to, and what's your current Django wish list? If you could just make it so, what could Django add that would improve things?

I mean, I think the async story is ever improving, and I think that's really important to Django still being a viable player out there. I think a lot of people may start to see Django as getting long in the tooth and still being a synchronous web framework. Obviously it's hard to bolt that on, and we've taken a very measured approach to getting async capabilities into Django. I think that's a high priority. For Django as a framework, and Python as a language, for people who are doing work on the web, it has to be async.

So 4.2 brings in async support for StreamingHttpResponse: if you pass it an async iterator and you're running under ASGI, you can do long polling or server-sent events just with core Django. I think that's quite exciting.
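To make the "async iterator" part concrete, here is a minimal standard-library sketch of an async generator that yields server-sent-event frames. The names `format_sse`, `event_stream`, and `collect` are hypothetical, and the `asyncio.sleep` call merely stands in for waiting on real updates. In a Django 4.2 project running under ASGI, an iterator of this shape is the kind of thing one would hand to `StreamingHttpResponse` with a `text/event-stream` content type; that wiring is not shown, since it needs a configured project.

```python
import asyncio
from typing import AsyncIterator, List


def format_sse(data: str, event: str = "message") -> bytes:
    """Encode one server-sent-event frame: event line, data line, blank line."""
    return f"event: {event}\ndata: {data}\n\n".encode("utf-8")


async def event_stream(count: int, delay: float = 0.0) -> AsyncIterator[bytes]:
    """Hypothetical async iterator that yields `count` SSE frames."""
    for i in range(count):
        # Placeholder for awaiting something real (a queue, a notification).
        await asyncio.sleep(delay)
        yield format_sse(f"tick {i}")


async def collect(stream: AsyncIterator[bytes]) -> List[bytes]:
    """Drain an async iterator into a list (handy for trying it out)."""
    return [chunk async for chunk in stream]
```

Because each frame is produced only after an `await`, the server can keep many such streams open without tying up a thread per client, which is the reason the async support matters for long polling and server-sent events.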

Then with psycopg 3, I was talking to Mariusz about this today. I'm like, you've got to put this example together for me, Mariusz, where we use psycopg 3's async client to listen to a notification that came either from a trigger or a save event or whatever, and then tell the listening client, from a streaming response, that it needs to fetch the latest. So that's real-time updates, all in native Django. Making streaming responses async-compatible is kind of the outer limit, I think, in terms of features that will be in Django core. But now, through 5.0, it's: okay, can we flesh out the decorator support for async? Can we get async signals? Can we flesh out some more stuff inside the ORM? The richness of the async support. So over the 5.x cycle it should continue to mature, and all those gnarly bits will start to smooth out and give you that Django experience that you want, right?

For people to pick it up and actually be productive, they need that Django batteries-included, Pythonic way of working.

Yeah. I can't remember who we were talking to, it might have been Michael Kennedy, who was saying that Python is a full-stack language, and it's important for it to have async because you don't want to have to just jump to Rust or Go or whatever, just because you need high throughput.

It's the same with Django.

There'll always be specialist async frameworks

that are more advanced,

but you don't want to have to change your web framework

just because you need a streaming view

or just because you need that.

So it's coming on nicely, I think.

Well, and that's some of your post-fellow open source work

is around async as well.

Yeah, so I'm not leaving Django

by any stretch of the imagination.

It's given me a career and I've given back to it for this period.

But now I'm going to step back to the back benches.

But I'll continue maintaining the channels projects and have a bit more time for them in actual fact.

And I want to continue working on the async stuff in Django itself.

And then there's things around the request object in terms of modernizing that and bringing in JSON request parsing that I want to add.

And then the next one, really, is to refresh Django's serialization story. This is something that I want to work on. Andrew Godwin just released his little API framework that's very FastAPI-inspired. It was in his... what's it called, the Mastodon client, the name escapes me. But he's released this Hatchway project as a proof of concept there, and it's quite exciting, because it takes pydantic schemas and pulls the request straight into the request handler, and then it will serialize back out using pydantic again. That's where the future's got to be.
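The schema-in, schema-out pattern described here can be sketched without any third-party packages. The example below deliberately uses standard-library dataclasses as a stand-in for pydantic schemas, so it shows the shape of the idea rather than Hatchway's or pydantic's actual API. Every name in it (`PostIn`, `PostOut`, `parse_body`, `create_post`, `handle`) is hypothetical.

```python
import json
from dataclasses import asdict, dataclass, fields


@dataclass
class PostIn:
    """Hypothetical request schema."""
    title: str
    likes: int = 0


@dataclass
class PostOut:
    """Hypothetical response schema."""
    id: int
    title: str
    likes: int


def parse_body(schema, raw: bytes):
    """Parse a JSON body into `schema`, rejecting unknown keys and wrong types."""
    payload = json.loads(raw)
    expected = {f.name: f.type for f in fields(schema)}
    unknown = set(payload) - set(expected)
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    obj = schema(**payload)  # raises TypeError if a required field is missing
    for name, type_ in expected.items():
        if not isinstance(getattr(obj, name), type_):
            raise ValueError(f"{name} must be {type_.__name__}")
    return obj


def create_post(post: PostIn) -> PostOut:
    """The handler only ever sees typed objects, never raw JSON."""
    return PostOut(id=1, title=post.title, likes=post.likes)


def handle(raw: bytes) -> str:
    """Glue: body -> input schema -> handler -> output schema -> JSON."""
    return json.dumps(asdict(create_post(parse_body(PostIn, raw))))
```

The appeal of the approach is that validation and serialization live in the schema definitions, so the handler body stays a plain typed function.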

Whether it's pydantic or whether it's attrs or something else, giving Django a better story, or a fresh story, there is something that I'm really excited about over the 5.x cycle.

Speaking of Airflow, we should ask Andrew to come on.

Yes, he works on Airflow. We'll have to get him back on.

You should, I agree.

Well, is there anything we didn't cover, Calvin?

No, I think just make sure you come to Python Web Conference. There's a discount code that you all can share in the show notes for everybody.

Brilliant.

And I'll get you, like, 15. Come hang out with me for a week. It's a lot of fun.

Yeah. But you don't have to watch all the videos live, right? You can buffer them?

Oh, so they're available immediately after the talks, which is a beautiful part of the LoudSwarm platform: you can watch any video that's happened already within 10 minutes of the talk. You can also pause and go get a drink and come back and unpause it, or come in and rewind. You can totally do that. That's one of the things I love. I didn't want to create this platform that just matched reality. I wanted to emphasize the features of being a fully virtual platform, and one of those is definitely: I missed the first five minutes of the talk, so I'll just slide the slider back and watch the talk live from five minutes ago.

Yeah. Well, just yesterday, actually, I was doing something trying to explain

async to someone, which is rough because I don't fully understand async. And I was looking at

Carlton's talk again from DjangoCon, which I've seen twice. And I was like, oh, God. So I was

definitely rewinding and sliding. So it's very necessary to have that. Yeah, the struggle is

very real. Well, Calvin, thank you so much for coming on. We'll have links to everything.

Python WebConf, we'll have the discount codes. Really, I like this. We see you at DjangoCons in between, but to catch up on what's happened in the last year is really interesting.

Yeah, it's always a great conversation. Thanks for having me.

Thanks for coming on again, Calvin. So, djangochat.com, we are on Fosstodon, and we'll see everyone next time. Bye-bye.

Bye-bye.