Transcript: Dev Environments - Calvin Hendryx-Parker
Hi, welcome to another episode of Django Chat, a podcast on the Django web framework. I'm Will
Vincent, joined by Carlton Gibson. Hi, Carlton. Hello, Will. And we're very, very pleased
to welcome Calvin Hendryx-Parker back on to the show. Welcome, Calvin. Hey, Will. Hey,
Carlton. How's it going? Very well. Thank you for coming back on, Calvin. I'm excited
to be here. This has turned into an annual thing. So, lots to talk about: Python Web Conf,
DjangoCons, PyCons. Maybe, like, what's been new the last year for you? There's a lot of
things, but what's top of mind when someone asks you what have you done for the last year
professionally? What are the highlights? Well, the highlights professionally: we've done a lot
recently with Airflow, so I'll be giving a talk this year at PyCon on scaling to thousands
of DAGs with Airflow, which is Django under the covers, which I feel is appropriate to bring up
on this podcast. But it's a really, really cool product for those of you who have not gotten into
orchestrating, like, your ETL loads. I mean, we use it for other things too. We've actually used
it to orchestrate manufacturing floor processes, you know, robotics, like interesting
crossovers into real-world physical things. And then we do also use it for some
traditional data loading, ETL-type activity as well. So I've used Airflow a little bit. The
thing I think is really exciting that I could perhaps get you to riff about is the DAGs,
the way you define the task dependencies. Can you perhaps explain that for folks? Because I
think it's really interesting. Right. So basically you can think of Airflow as just a really, really
glorified cron, or, for those of you in the Django community, it's like Celery: you
can do asynchronous tasks and have it run these various things in some sequence. That sequence,
though, is super customizable with Python, and so you can define tasks A, B, C, D in some order, or those
can branch. Now, the whole thing with a DAG: it's a directed acyclic graph, which means it can't
cycle back onto itself. There's no loops in a DAG, technically. So you're actually going to be making
decisions about what paths to follow, and there can be failure situations and recoveries,
and you can actually run certain tasks in parallel. So you may have a thing that starts, but the next step
may be highly parallelizable, so you can actually take it, split it up into a hundred separate processes
that can all run simultaneously, and then as they all finish they kind of report back in
and they can kind of come back together and run the next set of steps and things that are on there.
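
To make the fan-out and fan-in idea concrete, here is a minimal sketch in plain Python. It uses the standard library's `graphlib` rather than Airflow's own API, and the task names (`start`, `chunk_0` and so on, `merge`, `report`) are hypothetical placeholders. It only shows how a directed acyclic graph yields groups of tasks that could run side by side, and how a loop in the graph is rejected.

```python
from graphlib import CycleError, TopologicalSorter

# Map each task to the set of tasks it depends on (hypothetical task names).
chunks = [f"chunk_{i}" for i in range(4)]
deps = {chunk: {"start"} for chunk in chunks}  # fan out after "start"
deps["merge"] = set(chunks)                     # fan in once every chunk is done
deps["report"] = {"merge"}


def run_in_waves(dependencies):
    """Return lists of tasks; every task within one list could run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the graph loops back on itself
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished, unlocking the next one
    return waves


def has_cycle(dependencies):
    """True if the dependency graph is not acyclic."""
    try:
        TopologicalSorter(dependencies).prepare()
    except CycleError:
        return True
    return False


waves = run_in_waves(deps)
for wave in waves:
    print(wave)
```

A real scheduler would also handle retries, failure branches and actual process pools; the sketch leaves all of that out on purpose.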
And there's a couple of ways you can actually define DAGs. You can use just plain, straight Python, put
it in the DAGs folder inside of your Airflow instance, and it'll pick them up. It looks every
so many seconds to refresh that folder to see if there's more DAGs that have been dropped
in there. You can also define dynamic DAGs, so DAGs that are like DAG factories. It basically
can dynamically instantiate and generate, you know, hundreds. Well, I'll say hundreds at this point,
because thousands really isn't realistic. That's actually a limitation we ran into with the big
project we were working on, and that's what I'm talking about at PyCon this year: the fact that,
as those things scale up in dynamic DAGs, you run into timing issues. The scheduler looks for new DAGs, the dynamic DAG starts building a dynamic set of DAGs, but if those two times start meeting, that's when you run into real trouble, because basically the scheduler starts looking again and you're not done generating dynamic DAGs, especially if you've got on the order of thousands.
And that was the case we had, which was tens of thousands of DAGs in a single Airflow instance.
And I told this to some people who were in the data world, and they were like, that can't be done in Airflow.
I go, we did it.
There's an interesting workaround.
And if you want to post a link to the blog post on our site, we actually posted how we did it.
I mean, there's no trade secrets here.
We want to help the community.
But it's pretty cool, and that's what I'll be talking about and just kind of giving some insights into.
There's some cool workarounds with Python that actually helped in this case where we actually have one.
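
The workaround Calvin outlines here, one real DAG file plus many symlinks whose file names select the configuration, can be sketched in plain Python. This is an illustration only, not the code from the blog post: `CONFIGS`, `config_name_for`, `write_symlinks` and the file name `_template_dag.py` are all hypothetical, and the sketch assumes the loader hands each file to Python under the symlink's own path rather than the resolved target.

```python
import tempfile
from pathlib import Path

# Hypothetical per-source settings; in practice these might live in a database or config files.
CONFIGS = {
    "customer_a": {"source": "warehouse_a"},
    "customer_b": {"source": "warehouse_b"},
    "customer_c": {"source": "warehouse_c"},
}


def config_name_for(dag_file):
    """Pick the configuration key from the name the file was loaded under."""
    # Deliberately not resolving the symlink: the link's own name is the key.
    return Path(dag_file).stem


def write_symlinks(dags_folder, template, names):
    """Create one symlink per configuration, all pointing at the single template file."""
    links = []
    for name in names:
        link = Path(dags_folder) / f"{name}.py"
        if not link.is_symlink():
            link.symlink_to(template)
        links.append(link)
    return links


with tempfile.TemporaryDirectory() as tmp:
    dags_folder = Path(tmp)
    template = dags_folder / "_template_dag.py"
    template.write_text("# the single real DAG definition would live here\n")
    links = write_symlinks(dags_folder, template, CONFIGS)
    # Every link resolves to the same file, yet each one selects a different config.
    targets = {link.name: link.resolve().name for link in links}
    selected = {link.name: CONFIGS[config_name_for(link)] for link in links}

for name, config in selected.items():
    print(name, "->", targets[name], config)
```

Inside the template file itself, the equivalent lookup would be something like `config_name_for(__file__)`, again assuming `__file__` reflects the symlink's name.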
We still have one DAG, but we aren't using the dynamic DAG factories. We actually have a process
that comes through and writes out names of files that are all symlinks back to that single file,
and based on the name of the file we basically substitute in new configuration, new
sources of data, and we can now scale this to tens of thousands, because the reading
in of single Python files on the file system is fast. That part's super, super fast, so it's
really not an issue. It's when you get into the dynamic DAGs and trying to generate thousands of
them in memory, dynamically, that it slows down considerably. That's really cool. Can you talk
about how you, like, the progression to getting up to Airflow? Was this like a project where you
started with Celery and then you went to Airflow? Like, how do you reach for Airflow? That's always
my kind of question. I guess it depends on the problem space. In this case, the customer was
using a different orchestration stack. It was a PowerShell,
Windows orchestration tool that was kind of a glorified cron, but they had
bent it in awkward ways that were very hard to maintain, and they were also looking to basically
get in line with best practices for big data, which seems to be either using Python or,
you know, there's a few other common ones, Java, a couple of languages that are in there. But they
wanted to be on Python, though. They think it's going to be the easiest way to onboard new data
engineers, people coming out of school, people who can jump in and actually be productive.
Python's an awesome choice, obviously. So that was the reason they came to us: really
to first review some of their Python notebooks. They'd done some Jupyter
notebooks that were running in Databricks as part of the ETL process, and they wanted us to code
review them, see if they're up to best practices. And as we kind of started digging in and seeing
what was actually going on
inside their whole architecture,
we realized really quickly,
like this can be done
with some open source tools.
They probably can actually
minimize their reliance
on some proprietary licenses.
So spend more money
on their own people,
getting them up to speed
on Python and notebooks
and things like that,
as opposed to spending money
on licenses for big
proprietary pieces of software
that are actually hard to configure
and use in an automated fashion.
Those are a lot of point and click,
drag and drop,
you know, kind of analystware
type tools, where they wanted to be able to check this in, have it be all automated, have a CI/CD
process, actually do continuous integration, have tests on their configuration files so that they
can detect problems before they even go into the dev environment. I think something you mentioned
there was about the hiring process, because if you use, like, you know, so there's that paper about
using boring technology, and, you know, Django is always a boring technology, or Postgres is a boring technology.
But, you know, Django is cool. Postgres is cool. But in this sense, they're boring in that they're known commodities. And Airflow, I think, actually has made its way into that category now. It's a standard tool that's reliable, known by the community. And if you're using that tool, A, you can have less dependencies, less sort of niche bits, which are custom and bespoke and need special knowledge to run.
But you can hire someone new. You can say, hey, we're using Django, we're using Postgres, or we're using
Airflow. I think that's, right, super powerful. I do too, and I think it's worthwhile to link to that
article, that blog post. I go back and read it every couple of years just to remind myself that
all the new shiny things aren't, you know, always... there can be trouble in those waters. It may
look like greener pastures, but it's the boring tools that get the job done. Well, now, I believe in,
instead of doing bleeding edge, I do believe in, like, leading edge. I feel like we want to
stay on top of what the best practices are, and Django absolutely sits there on that leading
edge still of best practices. It's boring, but it's productive and it gets the job done, and it can
scale, and it can do small things and it can do big things, and then people have taken it to build
awesome tools like Airflow. I've made an analogy with surfing at times. In surfing, you don't
want to be in front of the wave because, you know, you're paddling, that's a lot of hard work, and if
you're behind the wave, well, you sort of just stop going anyway. You've got to be exactly
on the wave. Right, yeah, yeah, it makes sense. Well, I mean, the version of that, I mean,
because Silicon Valley loves that analogy, is, you know, you can be the best surfer in the world, but
you gotta be on the right wave, right? Because it's frustrating: you see someone
else, you're like, they suck at surfing and they're just, like, killing it, and it's like, well,
gotta find a wave. They've got a Malibu longboard and they're just going.
What's with that guy? Can I ask you about the psychology of the catnip, the new?
Because it's part of it as a programmer: you always want the new tool, you always want
to play with the latest this, the latest that. What's your thought there? Because how do you resist that?
Well, and it doesn't affect everybody. I mean, there are probably 90% of the developers in the
world who are sitting happily doing the work they're doing day in, day out, and maybe not even
looking to the side, like, what's going on. But that's not me, that's not a lot of people I hang
out with. So maybe, you know, birds of a feather flock together, and
maybe we're all just hanging out together, Carlton, because that's what we're attracted to.
You're a consultant, right? You're a consultant too. I feel like in consulting you're sort of
interested in a lot of variety and the new thing anyways, right? Like, it's not...
right, like, you get airdropped into cool stuff, you're not maintaining. Like, my iteration questions
assume that you built it from scratch, whereas you get brought in when they're like,
oh, things are broken. Like, you kind of want a consultant to be up on the shiny new thing and
to have a take. Yeah, no, that's true. I mean, people do rely on us to kind of vet out those shiny things
to make sure that they're not making a mistake, you know, a couple of years down the
road. Because everyone would love to be able to build a piece of software and just launch it and
be like, okay, that's cool, we're done, it's awesome. But that's not true. You build
software and then you have to maintain that software. You're now jumping on the treadmill
with your product, and you have to keep putting work into it to make it
still work, day in and day out. Now, back to the kind of shiny question: I think we do a lot of
in-house, like, show and shares, you know, kind of like kindergarten: bring your
favorite toy to school and show it off. So every developer kind of gets an opportunity now to
show off or look at some cool stuff, and then everyone gets to see it and talk about it and
discuss what their thoughts are on it. So we've gone back and forth a lot, especially on
JavaScript technologies, because we've historically been more of a Python shop, and, you know, the
things we've always done have been Python. But there's no ignoring the importance of JavaScript,
and now TypeScript, in our day-to-day work. So understanding what's our approach, and what
do we care about in these various technologies, how do we want to actually, you know,
code for them, put in linters, like, where are the guardrails for doing a new piece of
technology: those typically come out in those kinds of tech show and shares. Yeah, and I think in
JavaScript, managing the tech stack is all the more difficult, all the more pressing. It is
tricky. I would definitely agree with that. I mean, that's great to have
that internally, because I think in any field, you know, if you're like a musician who
plays rock and roll, you probably want to do jazz, right? Like, if you're doing some big
project in programming, it's nice to just spin up a greenfield thing. You need to keep that
beginner's mindset and that playfulness, which you can't always maintain if you're only working all
the time on something. I mean, I remember when I was, you know, in a previous life, I was a
book editor, so all I did was read, but reading with an eye to something as opposed to
reading for fun. You're like, why the hell would I want to read for fun when all I do is read? But
you lose something. So building that into your company, I think, is really important,
because otherwise you just get stale. And yeah, well, there's a certain passion people have to
have for the craft that they're doing. Actually, that's one of the things we look for in a
developer who would join our team: it's going to be a profile we call a craftsman. They've actually
renamed it since then, but we use a couple of assessments and tools to look at the whole
person that we're bringing in to hire or interview, and one of those criteria is there's some
indicators and markers from some specific assessments. They're not the end-all, be-all of
whether we'll hire a person or even interview a person or not. They're just one more
data point that we can actually look at, and knowing that they're in that craftsman, you know, arena
means a lot, because they've got just certain motivators in their life: I'm excited
about technology, or I'm excited about the craft of it, or I want to build, you know, as opposed to
other people who have different skills and different passions, and they're better suited
to different spots. Can I just ask one more question while you're talking about hiring, then,
and teams and craftsmanship? Do you have, like, a kind of standardized tool chain and
standardized processes so that people can switch between projects and, you know, the tooling's
the same? We are absolutely working on that, yes. So some of it depends on the client,
because we are consultants, and so some clients, maybe they're on Jira and we want to use,
you know, Bitbucket or YouTrack or GitLab or some other tool. And one of the things we're working on
right now is really the developer experience at Six Feet Up, and taking that developer experience
and then applying it to the client in a way that they see a lot of value in it, so that they may adopt
that practice that we feel is a best practice in the software development life cycle.
So absolutely, being able to use, well, everyone kind of picks their own IDE, but things like
pre-commit hooks, you know, having those set up so that everyone's running Black, everyone's running
isort, everyone's running, you know, Prettier on JavaScript, so that those common
guardrails happen before they even hit the CI pipeline. That helps get everyone
rowing in the same direction when it comes to their development tools and stacks. They
can use whatever IDE they want, but the code they're going to submit is going to comply with some
internal standards. And then using tools like, we're really rapidly adopting Kubernetes, and we had
adopted, you know, containers, because that actually leveled the playing field for folks to be on
Linux, Mac, or Windows and actually be just as productive on any of those platforms. Now the
next step is actually adopting Kubernetes, so that the process of deploying and developing almost
looks identical. There's just, you know, different versions of the container you may be using for
development that have the dev tools installed or the pydevd extension so you can do remote debugging,
but when you're releasing that container into production,
that whole process, the whole stack,
doesn't look unfamiliar,
because it may look unfamiliar now
if you're developing locally on Docker Compose,
but releasing into Fargate or some kind of Kubernetes,
The developers are like, I know nothing about this deployment process.
There's zero attachment other than the fact there's a container that runs my code.
Sounds nice, Carlton, right? Simple deployment.
It does sound lovely. It sounds lovely.
I think the key bit there is keeping the dev and the deploy looking similar, looking the same.
We've always strived for our QA or staging environments to be identical or as close as possible to production.
But bringing that a step back further into the development environment
without it being so onerous to run,
I think a lot of the issues before is if you wanted to run a full stack of stuff locally
and you were doing a Django app,
but you wanted to be able to have the Nginx and load balancing
and the Redis cluster and the Postgres cluster
and all these other kinds of pieces running,
well, you either had to install them from like Homebrew onto your Mac
and then maybe you had a slightly different version that was within QA
or there's always these little rooms for strange edge cases to sneak in because of that.
But if you start using these Kubernetes manifests, you're using the same manifest, but just defining different environments.
So there's the same versions of all the containers being run, same versions of the database, same versions of Redis, same versions of everything.
Now, it seems to be a lot more reliable and repeatable to do those deployments.
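
As a rough illustration of "the same manifest, different environments", here is a small Python sketch. It uses plain dictionaries as a stand-in for Kubernetes manifests, not real manifest syntax or any Kubernetes tooling, and the image tags, `BASE`, `OVERLAYS` and `render` are hypothetical names chosen for the example. The point it tries to show is that image versions are pinned once and only the surrounding settings vary per environment.

```python
from copy import deepcopy

# Hypothetical base definition: image versions are pinned here, once.
BASE = {
    "images": {"web": "example/web:1.4.2", "postgres": "postgres:15", "redis": "redis:7"},
    "replicas": 1,
    "debug_tools": False,
}

# Hypothetical per-environment differences; none of them touch image versions.
OVERLAYS = {
    "dev": {"debug_tools": True},
    "staging": {"replicas": 2},
    "production": {"replicas": 4},
}


def render(environment):
    """Combine the shared base with one environment's overrides."""
    overlay = OVERLAYS[environment]
    if "images" in overlay:
        raise ValueError("image versions are pinned in BASE and must not vary per environment")
    manifest = deepcopy(BASE)
    manifest.update(overlay)
    manifest["environment"] = environment
    return manifest


rendered = {env: render(env) for env in OVERLAYS}
for env, manifest in rendered.items():
    print(env, manifest["images"]["postgres"], manifest["replicas"], manifest["debug_tools"])
```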
Well, I want to ask you about your bootstrapping your local Python environment talk.
Actually, because Carlton and I were talking about this before you came on.
But it is true that like of my three books, the third one just uses Docker because I'm like, OK, like, you know, containers, you're all set.
But that's not that doesn't work for beginners or all edge cases, even though maybe in a more professional setting, it's just like just containerize it.
And that's what I get to at the end of my talk.
So go ahead, Carlton.
Well, no, no, no, because it's it's not necessarily for me.
It's not necessarily about professional versus amateur.
It's about sort of the kind of complexity of the team and the complexity of the stack. So,
fair enough, yeah. You know, if it's like a small startup, five people in a team,
you may not have the ops chops to take on something like Kubernetes, whereas if you've
got a 20-person team, you want to have that ops person there who's making sure that those 20
people are all doing the same thing. And it depends: if you're just running Python and,
you know, you've got a hosted Postgres and maybe you're running Redis, that extra
ops complexity doesn't necessarily pay for itself at the small scale, but if you've got a bigger team,
then sure, I think it does. I don't think it precludes you from using containers in either
case. Even for small scripts, we still use Docker locally to run a single Python script,
because you can guarantee the version of Python that's going to run in a guaranteed
user space inside that container. It keeps everything so much more sane. So even
now, and this is one of the things I got to at the end of my talk at PyCon last year,
Docker was really the way forward, I think, especially for developers who are across
a heterogeneous environment of OSes, versions of OSes, versions of Python, how they got installed.
You know, you don't know where that Python came from when you type python on the command
line. I know how to guarantee where that Python comes from when you write, you
know, docker run my image. So what's your answer to how do you bootstrap your Python
environment? Sorry, I want him to do the talk. I just want to make... okay,
well, just because I've had this conversation a lot with readers,
you know, and so one of the major things I hear from people around Docker is,
you know, you need a nice laptop to run it, you need a lot of RAM, or more, you know.
So it is a barrier. Someone who doesn't have a lot of money, who doesn't have a nice, you
know, fast laptop: Docker sometimes is a non-starter. So, well, that's one of the reasons we
moved to Kubernetes, which sounds odd to say because of... Yeah, explain that. The reason we've
started using tools like Kubernetes is, actually, if you're running Docker and Docker Compose, you
can run a Docker instance remotely on some other machine and, you know, from your local
machine control that through the Docker control plane or even Compose. Now, the issue as a
developer is you want to be able to bind mount your source code into that container when
it's running, so you can debug or change the code and have it hot reload, like Django
or the React code, have it hot reload for you. That's almost impossible to do on a remote
machine if you're just using straight-up Docker and Docker Compose, because of the whole bind mount
limitation. Now tools like Skaffold and Kubernetes change that model. Actually,
Skaffold is faster at syncing files into a running container than Docker is at bind mounting,
because of the I/O complexities that Docker brings to the table. Now, they fixed some of these performance
issues in Docker. It used to be terrible. I don't know if you've ever developed
a few years back on Docker where you're bind mounting in there, or had a large project that
was bind mounted into a container, but the I/O was terribly slow. But you get around that if you start
using alternate tools, like, for example, I mentioned Skaffold, which watches for file
system changes and synchronizes just those changes into the container for you, and then you get the
hot reload capability. And now you have the ability to actually dev locally, work on local files using
your IDE and all your standard developer tools, but have those synchronized into anywhere.
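
The "sync just the changes" idea can be sketched in a few lines of Python. This is not how Skaffold is implemented (it reacts to file system events and copies into a running container); the sketch below is a simplified polling stand-in, and `snapshot` and `changed_files` are hypothetical helper names. It only shows how comparing two snapshots of a source tree yields the short list of files that would need to be copied.

```python
import tempfile
from pathlib import Path


def snapshot(root):
    """Map each file's relative path to (modification time in ns, size in bytes)."""
    root = Path(root)
    result = {}
    for path in root.rglob("*"):
        if path.is_file():
            info = path.stat()
            result[str(path.relative_to(root))] = (info.st_mtime_ns, info.st_size)
    return result


def changed_files(before, after):
    """Files that are new, or whose metadata differs, since the previous snapshot."""
    return sorted(path for path, meta in after.items() if before.get(path) != meta)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp)
    (src / "views.py").write_text("print('v1')\n")
    (src / "models.py").write_text("print('models')\n")
    before = snapshot(src)

    # Simulate an editing session: one file edited, one file added, one left alone.
    (src / "views.py").write_text("print('version two')\n")
    (src / "urls.py").write_text("print('urls')\n")
    after = snapshot(src)

    to_sync = changed_files(before, after)

print(to_sync)
```

Only the entries in `to_sync` would need to travel to the remote container, which is why this style of syncing can stay fast even for large projects.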
You could have a sidecar, like a small little, you know, Intel tower sitting desk-side,
or it could be a cheap, you know, droplet in DigitalOcean someplace that is actually running, you know,
MicroK8s or minikube, or, you know, you've got some lightweight single-node Kubernetes option
for development, and now you can spin it up for the couple of hours a day you may need to work on that
specific project, for pennies, with, like, all the RAM you want. Yeah, I mean, related there, I've recently
become quite a fan of the Codespaces integration in GitHub, because I've just taken to
pressing, you know, in a repo, just pressing the full stop, and it just flips over
to this github.dev, and a kind of VS Code-y type thing loads up, and then you think, oh, actually I
need an actual computer here, so you just go, oh, load this
into a codespace, and then it refreshes, and then you're like, oh, do you know what, in the browser
it's a bit of a pain, I'm going to switch to VS Code on the desktop. It's like, that's quite nice.
And I've been a skeptic for a long time, because I really love local development
for all the convenience. And it makes developing on an iPad a reality. Like, the iPad Pro is now,
especially with the new Stage Manager windowing, the difference
between a MacBook Pro and an iPad Pro is becoming very, very slim as far as the user experience and
usability. Sorry if you can hear my dog in the background there. Yeah, we're sorry... we like the
dog. Bring the dog on more often. Yeah, he's quite cute. But I love that portability. I want to
travel with just my iPad, but I still want to be able to jump in and fix a problem if I need to,
although my leadership team would tell me I should stop that and actually rely just on my
team instead of me doing those kinds of things. But I still love to tinker. I still want to develop
and be able to do it from an iPad, wherever, you know, hooking up a keyboard, and I've got a mouse,
and now I'm in Codespaces or some remote compute environment. Yeah, there's way, way...
well, actually the iPad is dang fast, but it's so much nicer to have a cluster of machines in a
cloud someplace that I can resume into. So if I was at a coffee shop, pick up, go back to the room or
whatever, and then sit back down again like nothing happened. Like, I'm back in the
flow again. And it gets over that limitation of, you know, iPadOS, where
you can't really run a Python interpreter, or you can't run a compiler, or you can't run
a, you know, anything. Now it's all on some box somewhere in the, you know, ether-sphere. Yeah,
I think for developer experience it's a big advancement. Let's kind of roll
back to those developers who are not the bright, shiny, you know, lovers. They want to just
come in, do their work, go home. I think for those developers this is a great innovation:
the people who do like the bright and shiny can set up the most amazing development environment
and now share it like a rubber stamp to everybody in the whole team, and now they can all share in
those same productivity tools that, you know, the one person is super passionate about, getting
all the power tools in their environment. Their dotfiles are thousands of lines long because they,
you know, just love tricking out every aspect of it. That might be me. But I can now set this up
for everybody and be like, check this out. Like, during a show and share, show them all the
cool new superpowers that are available to them on their command line or in their IDE or in
their editor. And it just goes a long way, because some of them would be like, oh, that's cool, I would
use it if it was easy to set up. Those things aren't always easy to set up, but they can be
shared with others easily. Yeah, so that dev container approach, that's... Yeah. Well, speaking...
I wanted, maybe we've already covered some of it, but your PyCon talk about bootstrapping: I noticed
on your command line you've got that nice little timer thing for commands, like a little hourglass.
I was, cause I mean, I'm not fully tricked out, but I, you know, it's fun to do, but
I was like, Ooh, I hadn't seen that before when you were doing some commands locally.
So, yeah.
So I used, I can't remember what phase of my computer that was. So again, I'm a tinkerer,
a hacker.
I love fiddling with my laptop all the time. For a while
I was using the Bullet Train Go command-line prompt, but that went kind of unmaintained,
and I finally graduated to Powerlevel10k, I think, is the current one
I'm on. I've been really happy with it, but it's infinitely configurable. You can spend a
couple of days, you know, running through dotfiles. If folks are curious, my dotfiles are on my
GitHub, so you can actually just go to github.com/calvinhp/dotfiles and see
the configuration I use for the Powerlevel10k stuff. And it's so nice. I mean, it's nice
to have all that extra data right there available to you, especially the time, which is more valuable
than you think, because as you are running commands and doing things throughout the day,
you think, oh shoot, how long did that thing take to run, or, you know, what time was I working on X,
because maybe I needed to jot it down in my time sheet or whatever. I can just scroll
back through my terminal buffer now and see, oh, that was at like 10:15 and, you know, it took
10 minutes to run. It just provides all that additional context, especially the virtual
environments bit too. Knowing what the system Python might be, or what the virtual environment
you have activated actually is, can save you a lot of mistakes. I don't know how people get by with
a prompt that just has the dollar sign. Like, mind-boggling. Like, well, that's, yeah, I mean, you
need, at a minimum, to have the Git stuff to let you know if something's changed.
Well, you say minimum, but how many people have you ever seen, like, open up a terminal
and you're like, oh boy, we got some work to do here? Like, they're riding pretty stock.
However, I try really hard not to immediately judge them. Right, I'm feeling judged. Well, because,
no, sorry, no, but, you know, it's like site design, right? It's
like Simon Willison's site, right? Their terminal looks like that. It's like they're
either Simon or they're not. You know, that's, immediately I'm like, it's an extreme here.
That's what I think. All right, wait, wait, anyway. So, I mean, your talk though: local Python
is a challenge. I thought it was actually the best job I've seen of laying it out. I did want to
ask you though, I think you punted a little bit on, like, how do you recommend people install
Python, because you mentioned there's, like... I 100% recommend pyenv for installing a version of Python.
Any other Python is untrusted and possibly incomplete. That's been a huge pet peeve of mine,
the fact that you can install Ubuntu or be using an Ubuntu container and you won't necessarily have
pip available, or some of the standard library modules will not be there,
because of some opinions that the OS makes. And most likely they made those opinions to protect
the OS. They don't want you randomly pip installing something into the root global
environment, which is a good thing. But it means that you really need to be installing pyenv
so you can control which specific version of Python you want. And I recommend installing
pyenv with the virtualenv extension, so you can, right from one command line, create the
virtualenvs, activate them, and now when you go into those directories or out of those directories
it automatically activates and deactivates those virtualenvs. So I'm never typing activate, I'm never
typing workon. I'm always just relying on, you know, the .python-version file that's in my,
you know, path, that just picks up what I should be doing wherever I'm at, because I'm moving
from project to project a lot. That may not be common for most people, but it's actually more
common than you think, because you're probably developing side tools that you're using as part
of the course of your work, and those should be in separate projects from the main project
you're working on, and they maybe have different requirements than the main project you're working
on. You may have different version requirements that you need to pin specifically for that side
tool as opposed to the more stable project that might be running on an LTS version of a library
your project i like to in your talk you mentioned or you recommended um pipx for you know global
global things like black and stuff that you want yeah for folks who don't know pipx kind of takes
a lot of that headache out of the way and also i think you get more recent versions of some of
those pieces of software so you could brew install black or i sort i think those packages may be
available in homebrew but they're not going to be as up to date as if you pip it pip installed them
from pypi because those may be the latest versions maybe the homebrew maintainer hasn't gotten to
updating the the entry in the homebrew you know package catalog but pipx gives you the best of
all those worlds where i can now pipx install black it installs and it creates a virtual
environment behind the scenes in a dot directory that you don't see but then it also puts the tool
into your path correctly so you don't think about it you just type black and it's in your path it's
going to work it's going to you know always be pinned to the right versions that are used for
that version of black and you can inject other dependencies in there for example
if you're using like the markdown tools and you want to install pygments into that virtual
environment it's absolutely possible to now inject
the pygments library into that markdown virtual environment and use it with
your python markdown command but have a different version of pygments you may use standalone from
the command line because they're in two separate environments so the case i have there is with
sphinx because you always need extra you know plugins and things with sphinx so that so there's
one environment that's got all the sphinx bits in and they can just be updated yeah yeah yeah and
I've been checking in I've been using um chezmoi for my dot file uh configuration
management so that tool allows me to actually have a chezmoi init bootstrap command so
basically a text file defines on the very first run of this dot file manager run these
commands so it basically goes installs homebrew installs pipx installs all my standard utilities
so that every command I expect to use now will be available and that's a big complaint a lot of
people had about customizing their dot files and customizing your shell is you log into some random
shell to debug something and half your stuff doesn't work so your muscle memory is all broken
on you know your keyboard you know memory is missing but for me i log in i literally do chezmoi
init calvinhp because it comes from my github it knows to look in github to get the dot files to
run that first bootstrap command and maybe it takes 30 seconds to a minute but everything now
it just works yeah that's that that not having a tool when you log into a strange box that's kind
of why i keep my my setup quite vanilla is that carlton there's no reason check out there is i'm
old that's speaking of speaking of being old carlton can we tease your pycon italia keynotes
because it's out there yeah yeah yeah you can so i've um i was invited to keynote at pycon
italia and i'm doing a keynote called um open source for the long haul which is you know
about what it says um and but the the kind of the the relevance of being old is it's kind of
the um the the counterpart to the growing old gracefully talk i did five years ago
in heidelberg when i started fellowing and now i'm finishing fellowing and i'm doing a kind of talk
okay so and and yeah this is the the reflections it's not explicitly a reflection on that but as
i'm writing it it's a reflection on okay that's what i said five years ago what would i say now
so i'm quite excited about that i have to go watch that i'm quite interested i i've followed your
journey from the beginning as a fellow so curious to see what the the wrap-up is the post-mortem
you're not dead luckily yeah no no no no no no but you i mean part of it is you've got to be
able to step away before you become you know so um well segue to conferences python web conf is
coming up yeah in um a couple weeks march right yeah march uh 17th through the 23rd i think that's
yeah we'll have we'll have a link to that right i'll be in trouble but uh
Okay, we'll put a link. But this is, you know, speaking of watching things grow over time, I mean, this is really up and to the right.
Fifth year.
Fifth year, but I mean, the speakers and the number of people and everything is really impressive. I saw actually you did some videos. You did one with Al Sweigart, but like, you know, even teasing out, you know, guests now. So it's really impressive to see.
There's more of those coming. I think tomorrow I'm actually interviewing Anthony Shaw, who will be one of our speakers from Microsoft. We've got two speakers from Australia who are joining us. I mean, that's the dedication for the excitement level that the speakers have about this conference is that they're willing to come at like three in the morning their time and give their talk live to the crowd. But it has grown a lot over the past five years.
We started this in 2019 as a virtual conference before it was hip to be virtual, and it will always be a virtual conference.
And one of the things I always love to tell people about this conference is it is really meant to be accessible to those who may not be able to make it to a PyCon or a regional conference of some kind.
We want to make sure that fills that gap.
And the other part is I want to make sure it fills in where I feel like some other conferences may be lighter on the intermediate content.
I think there's definitely a space for like the PyCons and the regional conferences to handle a lot of those newer intro talks to get people spun up into the Python community in a nice, gentle way.
We're really meant to bring some heavy hitting talks, not that I would dissuade you from coming if you don't know anything about Python, but you're going to be in a more advanced group of speakers and guests and people who are attending the conference.
It's really a lot of very professional folks who are doing a lot of cool things with Python.
I had a kind of intro talk with or a conversation with one of our speakers yesterday.
They're talking about the state of fusion.
So those of you who are kind of following the fusion story right now that we finally had in the last couple of weeks,
the first time that more energy was produced by fusion than to make the fusion process happen,
we have a speaker who's actually going to talk about where Python's playing a role in that science.
so i'm very excited they're they're super passionate about you know obviously this
topic i know very little about fusion so i'm looking forward to hearing about this
this wasn't the speed ups in python 3.11 no no no no no i don't think they use it in that
level where the performance actually matters they've got other things they're doing but
when it comes to data and accessibility and getting people onboarded to be able to process
all the data and actually come up with new insights into what's going on in the science
because it just provides people with so many
superpowers. I mean the batteries included
and then you layer on top of that like pandas
and matplotlib and all these
great libraries that just give people
superpowers without having to do much
work to kind of get to that
point of having all those things at your fingertips
is really a powerful story
for Python.
So that's what you'll come to
expect to see at Python WebConf this year.
We're going to be five days.
Half days East Coast time. We will have
a keynote speaker kick off every day and a
keynote speaker to end every day i really love hearing inspiring speakers so even though we are
a virtual conference we are having 10 keynote speakers as part of the the the standard program
and that will never change because i love having just big thinkers get up on the in our stage and
actually talk about the things that are going on in their lives and and those talks range from
technical to i've had some speakers who had barely ever used zoom before who are non-technical give
some amazingly moving talks.
So if you look last year's talks are all online.
There's a talk up there by Noel Musica
talking about the flower.
It'll bring you to tears,
but it'll move you to action
and whatever that passion may be inside of you.
And that's what I want to draw to people.
And mine just happens to be Python in the community.
And that's just where I get a lot of energy.
So I want everyone to join me for Python Web Conf.
Super.
I know Carlton, you know,
with all your post-fellow time.
Yeah.
2024, Carlton will be everywhere.
Oh, I don't know.
We were just talking about this.
Yeah.
Actually, you might appreciate this, Calvin.
I was sharing a quote about, you know, marketing, right?
Tech people famously don't like to do marketing.
And it was saying that, you know, it seems self-involved to market, but it's actually more self-involved to think just building it, people will come to it.
Because why are you so important that people care?
It's so true, though.
It's so many times people think that, you know, if you'll build it, they'll come.
How are they going to know about it?
Who are you that's going to bring them to it?
Right.
I mean, I, and I try to think about this like, like movies, right?
Like Tom Cruise, right?
He spends, I don't know, six months making a movie.
He'll spend a year marketing it like Tom Cruise, right?
Like it doesn't matter how big you are because they know what's up, right?
Like if you, but you know, the hard thing is you think all the time is spent on the
making and really it's that long tail of, of, uh, of marketing.
So anyways, I thought that was good. It's like, yeah, because it does feel a little icky to self-promote, but it's like, but if you don't self-promote, like, how does anyone know anything about you?
so yeah that's definitely been a problem in the open source community i think there's been a lot
of resistance to people making money um you know they feel like they should all be doing it for
just the good of the projects but the these projects like django wouldn't be around if there
wasn't money to fund the initial seed of the project or the the fellows to you know make sure
they can kind of steward the project into better directions or the the companies who are submitting
back the pull requests for bug fixes and security fixes because they needed them there's still money
involved and it's not a bad thing we shouldn't be upset about talking about that and we shouldn't
be upset about marketing yourselves as awesome developers out there but like eric matthes was um
he's putting out some more content he was like he said on um mastodon or fosstodon whatever we
call it um i'm a bit shy about saying it's like no eric we want to know we really like this is
this is why we're here he does he does have a substack now i think it's mostly python so
right but even even you know even the the top python book author um yeah feel feels that way
so there's a lot of good people to emulate or model out there i mean simon is excellent i mean
at sharing every idea and thought he's having in his head and i love it i want to hear more
about the things he's doing and he shares all the time his talk at um that he gave at djangocon
this year um was part of it it was only one small part of it but was that the every commit needs to
have like the documentation with it and every project needs to have the blog post that goes
with it and read me front page the whole entry into the project is really important and if
and if you're not doing that bit it's not done right was his kind of line and i i did really
like that it's like yeah okay so i've got to write the docs here and i've got to actually tell people
about it well um jeff bezos amazon's famous for his executives like you kind of like write the
press release first and then you figure out how to do a new project so just that idea of like um
yeah working backwards right as opposed to like let's just jump in and figure it out it's like
you know of course which is what we all kind of do but it is helpful to be like and how are you
going to describe it to someone you know it's like well it's complicated it's all it's like
and they're just like zoning out immediately you know versus you know i think it is good
guardrails for yourself like what you know as we see like you know squirrel squirrel squirrel of
like interesting things we want to chase like what what what's the guardrails that we impose
on ourself for a project anyways it seems like there's been multiple attempts at like document
driven development um but they just never really caught uh there's it's that's a hard leap to make
to go directly from well i'm going to write the docs first and then i'll end up in tests and then
then i'll end up in the actual code that does it yeah doc test driven development yeah that's pretty
black belt level to be able to do that i've never i've never done it so
i mean well so i i occasionally will um write the test first but normally it's when i'm um
i've got like a complicated algorithm or complicated process and i've got if i just
wrote that little bit then that would be one step in it okay so i'll write a test for that
one little bit okay and then i can write the bit and then okay i'll write a test for that bit and
then and in that circumstance i'm writing tests i'm doing test driven development but very quite
often it's the other way around it's like that blitz out some code and now i'm going to write
some tests the other end i'm saying the docs normally the docs are right now i better write
some docs yeah yeah well calvin i wanted to ask you carlton and i were also talking about um
testing in that i got a reader asked me a really interesting question which was
like is there a canonical encyclopedic guide to django tests um which i don't think i just i
wonder like internally do you have something like that because it like what is the point where it's
like okay like yeah you test your models views templates urls but like for a beginner it's hard
to know like what's core django what's new um you know where is that line where things become
custom um and i've you know i don't i have some testing tutorials but i don't have a great you
know pie chart of like here's where you need to make sure you have you have coverage other than
you know running coverage so i just curious internally how do you think about that because
i'm like yeah i would love if that existed and we could all agree you know like here's here's the
80 20 right other than like when there's a new page you should test you know test all the pieces
of it but um yeah right now that's really driven internally by the developers you know love of
testing it's going to determine how good the output may be um luckily on our side we've got
some great new developers who've joined the team recently that are very excited by you know test
driven development and so we do see more test coverage in our projects um and prior to that
it was just you know it's all new or people weren't excited about it but i think that has
to be a culture like you have to start bringing in folks who are who are excited they'll give the
demos they'll you'll see the test you know coverage creep in or start happening and with
ci pipelines when you have a failure and you get a nice little email saying you you broke the build
that that obviously is a nice peer peer pressure driven development that encourages you to go
fix and or add you know test coverage because maybe you've got a rule that the coverage has
to be x or that your commit can't go through do you have the rule that whoever broke the build
last is responsible for running it maintaining it we do i mean that but i keep seeing sometimes
broken builds stay for a couple days i'm like come on what's going on here let's get back in line
have to go have a conversation a couple people i think if i was the beginning thing targeting for
beginners i'd say like have a test at least on the sort of main functionality of every bit like
you know check that your home page loads or check that the login button actually submits or or
something and then if you run that with ci then if you break it you know that you broke it and
you're like okay and the other thing you know 100% coverage well that may be a
bit too much to aim for but if you've got that kind of basic coverage over the core bits it's
like a smoke test that oh actually i broke something i think there's value in writing
those what seem like silly silly tests like you're like oh this is obvious this should always
obviously work well if it should always obviously work you might as well write the test for it so
but then when it obviously doesn't work, you'll know.
Yeah, yeah, yeah.
Yeah.
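The "check that your home page loads" idea can be sketched in a few lines. This is a framework-agnostic illustration that uses only the standard library so it runs anywhere; the tiny `app` and the `request` helper are hypothetical stand-ins, not anything from the episode. In a Django project the same checks would normally go through Django's test client instead.

```python
import unittest
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Hypothetical stand-in for a real site: a home page and a login endpoint."""
    path = environ["PATH_INFO"]
    method = environ["REQUEST_METHOD"]
    if path == "/" and method == "GET":
        status, body = "200 OK", b"Welcome"
    elif path == "/login" and method == "POST":
        status, body = "302 Found", b""
    else:
        status, body = "404 Not Found", b"Not found"
    start_response(status, [("Content-Type", "text/plain")])
    return [body]


def request(method, path):
    """Call the WSGI app in-process and return the status line it reported."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["REQUEST_METHOD"] = method
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    app(environ, start_response)
    return captured["status"]


class SmokeTests(unittest.TestCase):
    """Deliberately 'obvious' tests: they only prove the core paths respond."""

    def test_home_page_loads(self):
        self.assertEqual(request("GET", "/"), "200 OK")

    def test_login_form_submits(self):
        self.assertEqual(request("POST", "/login"), "302 Found")
```

Run under CI, a pair of tests like this fails loudly the moment a change breaks the home page or the login flow, which is the smoke-test value described above.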
I, it's hard when someone, you know, again,
this is what I spend my time with someone who's so new,
like someone who was saying,
okay, because I have tests in all my books, but I can't write an entire book on Python while I'm
writing a book on Django. And they were like, okay, we're using setUpTestData. It's like,
what's this class method? And it's like, okay, well, what are classes? And then why am I using
cls? And it's like, oh, PEP 8. So it's this tension that I enjoy this challenge of trying to be like,
I want to explain things. I don't want to have magic everywhere, but how deep do you want to go?
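For readers stuck on exactly that question, here is a minimal sketch of what a class method and `cls` mean in a test class. It assumes plain `unittest` and a hypothetical `Author` class rather than a real Django model, so it runs without a project; Django's `setUpTestData` is likewise a class method that runs once per test class.

```python
import unittest


class Author:
    """Hypothetical plain-Python stand-in for a Django model."""

    def __init__(self, name):
        self.name = name


class AuthorTests(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class, before any test method.
        # `cls` is the class itself (AuthorTests), not an instance.
        # PEP 8 recommends the name `cls` for a class method's first argument.
        super().setUpClass()
        cls.author = Author("Ada")

    def test_name(self):
        # `self` is one test instance; class attributes are reachable through it.
        self.assertEqual(self.author.name, "Ada")

    def test_fixture_is_shared(self):
        # Every test sees the same object that setUpClass stored on the class.
        self.assertIs(self.author, AuthorTests.author)
```

The design point is that data created in the class method is built once and shared, whereas anything created in `setUp` is rebuilt for every test method.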
sometimes just like, just trust me on this. This is a best practice. And like, you know,
here's some links if you want to knock yourself out, but it is, it is completely overwhelming
because you're just taking it on faith. It's like, oh, I copied these tests and it seems to work.
And I, I kind of understand why, but like, you should just put a little footnote, go see Brian
Okken's book if you want to like dive in here, you know? Yeah, yeah, yeah, no, exactly. But,
you know, there is a tension there between the kind of person who wants to understand everything,
but also it's just turtles all the way down and so some i see that these people get really hung
up right and they have and sometimes it leads to like i'm just never going to grasp all this
and it's like you know it is fractal like you do need to put limits on stuff and just be like you
know have a list of like i'm going to go learn those things um but you know at the same time
like if someone was you know someone could read the python docs and like read about class methods
but there's no way they're going to remember that until they need to use it so it is kind of like
you kind of need to use it to learn it and then um anyways i have empathy for these things and
even for me too i'm kind of i'm like oh like oh yeah i'm like i know we use that like why do we
use that like pep 8 i was i was like i was like why the heck do we do this do it this way it's
like oh yeah it's a pep somewhere we have that same tension even internally on our team it's
like we're a team of senior developers who are continually wanting to learn new things but
there's no point in going and learning some new piece of technology if you're not going
to go apply it right away yeah because you'll just have to give six months when that project
does come around where that technology is actually warranted you're gonna have to go brush up on it
again anyway um and so a lot of our learning is really on the job hands-on when a project comes
about um you kind of we track in like a kind of a skills matrix or inventory of what people
obviously what they're good at but what they want to learn so that we can actually match them up
with some projects where they would want to actually expand on the new skill set that's
really wise because i've seen so many times a developer in a job who's a little bit bored
start going experimenting with some new tech because they're a little bit bored and the way
around that of course is to preempt it and know that they want to use that and assign them to the
the task so that's i think that's that's obviously a battle-won learning yeah it's a free tip for the
day yeah well i think also um bugs like this is maybe less as a consultant but more when you're
internally working on something you just have this big long list of known bugs and figuring out how
do you how do you deal with that because you know you can't just fix bugs all day and have perfect
code because you're never going to ship new features but on the other hand if you never
fix any bugs developers are all gonna eventually leave and get frustrated so um i've said this
before but i think one really nice way is to have like bug squashing days you know every month every
whatever so that it puts the onus on the developers so it's like okay there's 100 bugs like you take
ownership for deciding what are the 10 you're going to handle and it's a way to kind of off
um you know it's like a escape valve for like yes we are we are fixing these things but not right
now because we're doing something else but it's not you know an indefinite like code is all crap
If you look on our site, over the years, we've done an activity internally called FedEx Day.
It's now called Ship a Day.
And actually, it's kind of more recently morphed into what we're calling Tech Debt Relief Week, where at least like the first week of January, everyone's kind of coming back from the new year and they've been off for like a week or two.
And we'll be like, we're not going to do any customer work this week.
We're just going to focus on tech debt relief or new process enhancement.
And so it just kind of came about that that was a great way to give people that outlet to clean up the little messes that have been made.
And sometimes we do it more than once a year, but usually once a year, we spend a few days not on billable work, but actually fixing up and scratching those itches and coming up with new innovations and ways we can improve our workflow.
It feels so good as a developer.
I mean, it's hard to describe to someone who doesn't do this kind of thing, but it's like a cleanliness thing.
right like oh there's nothing worse i mean for me picking up an old code base and we thought we
were doing best practices at the time and this is maybe a couple years back and looking at it and
going oh my gosh i got so much to clean up here first before i can even be productive like my
it's like having a clean desk before kind of getting started on some creative work
it's hard when you like you just you just see little like issues here and there and there and
there it distracts you from solving the actual problem yeah at least that's for me i assume most
folks were like that one one analogy i've seen for this kind of environment you know this kind
of discussion about tech debt versus shipping features is that you know okay if you just
should work on shipping features in the short run you'll ship more features but sort of velocity
falls over time and you know in the medium to long term you're actually going slower because
you didn't prioritize would you agree with that i would agree with that it's like the drag the
friction coefficient yeah or yeah well obviously in the beginning of a new product there's a lot
more new features to build too because there's nothing if nothing exists i mean we're we're
experiencing that with the loudswarm platform that we host python web conference on is we
actually pick it up a couple times a year if folks are not busy on another project they
will get assigned to do some cool new stuff on loudswarm which is actually really fun
that's been a great platform for us to experiment with new technologies like a lot of the web sockets
work we've been doing a lot of the kubernetes you know advancements in our developer
experience have all come directly out of us innovating on the loudswarm platform that's cool
i love that's all because you've hosted the djangocons for you know the whole of the covid period and
since as well but um i remember last time you came on you were talking about fargate and containers
on fargate and now you're talking about kubernetes so i see that we still use you can still use
fargate with kubernetes if you're using eks on amazon and you don't know what
your capacity is going to be or you want to scale closer to zero it's possible actually to spin up
compute in fargate fargate works on ecs and eks uh just equally as well so you can actually define
your pods and your your resources to pull from fargate ad hoc serverless resources sorry a little
plug there for amazon fargate but i do find fargate to be a great option when you just aren't
sure how much resources you need or you don't want to be maintaining the ec2 instances that would be
hosting those resources yeah no perfect um yeah so many pieces so many moving parts to have a
playground to work on it and it helps a lot uh and it brings people up to speed on like ci/cd pipelines
and terraform and configuring cloudfront and all the like caching pipeline and you know how do you
split off web sockets into separate like pods of like resources that are servicing just those
kinds of requests like there was a point came up last week or a couple weeks ago where people
asking how did the front end get deployed like the ci pipeline would run and it would deploy
new fargate containers for the back end and there would magically be a new s3 bucket synchronized
with the latest front-end changes and people were like we don't know where how those get there
and so we had to go track it down and it's actually kind of clever is that we actually
had fargate tasks that when the fargate release would happen obviously new task definitions would
be generated fargate picks those up and then replaces the old containers for django but at
the same time it runs the django migrations it does the collectstatic and then it synchronizes
as a task that then runs and stops all the front end to the front-end s3 bucket so it
means anytime you release any release you're sure the front end always matches the back end
because they were tagged at the same time yeah lovely that's nice lovely um coming up on time
um our previous guest was um mariusz felisiak we were talking about django 4.2 and 5.0 um so my
question to you is are there new features you're looking forward to and like what's your current
django wish list if you just you know make it so what what could django add that would improve
uh things i mean i think the the async story uh is is ever improving and i think that's really
important to django being a viable player out there still i think a lot of people may start
to see django as getting long in the tooth and still being a synchronous web framework and
obviously it's hard to kind of bolt that on and we've taken a very measured approach
to getting async capabilities into django and i think that's that's a high priority i think that
to make sure that Django and Python as a framework,
Python as a language
for people who are doing work in the web,
it has to be async.
So 4.2 brings in async support for streaming responses,
which is if you pass it an async iterator
and you're running under ASGI,
you can do long polling or server-sent events
just with core Django.
I think that's quite exciting.
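To make the async iterator idea concrete, here is a minimal sketch that needs only the standard library. The `event_stream` and `collect` helpers are hypothetical names for illustration; the commented-out view shows roughly how such an iterator would be handed to Django's `StreamingHttpResponse` on Django 4.2 or later when served over ASGI, and is an assumption about a typical setup rather than code from the episode.

```python
import asyncio


async def event_stream(messages, delay=0.0):
    """Yield each message as one server-sent-events frame (bytes)."""
    for message in messages:
        # Stand-in for waiting on something real (a queue, a database notification).
        await asyncio.sleep(delay)
        # An SSE frame is a "data:" line terminated by a blank line.
        yield f"data: {message}\n\n".encode()


async def collect(stream):
    """Drain an async iterator into a list, as a client reading the stream would."""
    return [chunk async for chunk in stream]


# In a Django 4.2+ project served over ASGI, the view would look roughly like:
#
#     from django.http import StreamingHttpResponse
#
#     async def updates(request):
#         return StreamingHttpResponse(
#             event_stream(["one", "two"], delay=1.0),
#             content_type="text/event-stream",
#         )

frames = asyncio.run(collect(event_stream(["one", "two"])))
print(frames)
```

Because the generator awaits between frames, the server can keep many such connections open without tying up a thread per client, which is what makes long polling and server-sent events practical here.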
then with psycopg 3 um i was talking to mariusz about this today and i'm like
you've got to put this example together for me mariusz where we use psycopg 3's async client to
listen to a notification that came either from a trigger or you know a save event or whatever
and then tell the listening client from a streaming response it needs to fetch the latest
so that kind of real-time updates all in native django and then so that giving streaming responses
um making a streaming response is async compatible that's kind of like the outer limit i think in
terms of features that will be in django core but now through 5.0 it's like okay can we flesh out
the decorator support for async can we you know just get async signals can we you know flesh out
some more stuff inside the orm can we you know these these richness of the async support now so
over the next over the 5.x cycle it should mature you know continue to mature and all those all
those gnarly bits will start to smooth out and give you that django experience that you want right
um for people to pick up and actually be productive they need that django batteries included
you know pythonic way of of working yeah i think i can't remember who we were talking it might be
michael kennedy who's saying that um uh you know python is a full stack language and you know it's
important for it to have async because you don't need you don't want to have to just jump to rust
or Go or whatever, just because you need high throughput.
It's the same with Django.
There'll always be specialist async frameworks
that are more advanced,
but you don't want to have to change your web framework
just because you need a streaming view
or just because you need that.
So it's coming on nicely, I think.
Well, and that's some of your post-fellow open source work
is around async as well.
Yeah, so I'm not leaving Django
by any stretch of the imagination.
It's given me a career and I've given back to it for this period.
But now I'm going to step back to the back benches.
But I'll continue maintaining the channels projects and have a bit more time for them in actual fact.
And I want to continue working on the async stuff in Django itself.
And then, you know, there's things around the request object in terms of modernizing that and bringing in JSON request parsing that I want to add.
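For context on what JSON request parsing would save, a view handling a JSON body generally has to check the content type and decode the body itself. The helper below is a hypothetical sketch of that manual step using only the standard library; it is not Django's API and not the design being proposed.

```python
import json


def parse_json_body(content_type, body):
    """Hypothetical helper: decode a request body only if it is declared as JSON."""
    # Strip parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/json":
        return None
    # json.loads accepts bytes as well as str and raises ValueError on bad input.
    return json.loads(body)


payload = parse_json_body("application/json; charset=utf-8", b'{"talk": "async"}')
print(payload)
```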
and then the really sort of next one is is to improve is to refresh django serialization
story so this is something that i want to work on and um well andrew godwin just released his
little api framework that's very fast api inspired that was in his um what's it called uh the
the mastodon client that's called something the name escapes me but he's released this hatchway
project as a yeah um proof of concept there and it's it's quite exciting because it takes
pydantic schemas and pulls them pulls the request straight into the request the request handler and
then it will serialize back out using pydantic again that's that's where the future's got to be
that's where it's like okay whether it's pydantic or whether it's attrs or whether it's you know
something else that giving django a better story or a fresh story there is something that i'm really
excited about over the 5.x um cycle speaking of airflow we should ask andrew to come on yes he
works he works on airflow um yeah we'll have to get him back on you should i agree well any is
there anything um we we didn't cover calvin i know no i think make sure you come to uh python
web conference you guys have there's a um a discount code that you all can share in the the
the show notes for everybody brilliant and it'll get you like 15 yeah come hang out with me for a
week it's a lot of fun yeah yeah but you don't have to watch all the videos live right you can
you can catch you can buffer them oh so they're available immediately after the talks uh which is
a beautiful part of like the loud swarm platform is that you can watch any video that's happened
already uh like within 10 minutes of the talk you can also pause and go get a drink and come back
and unpause it or come in rewind and be like yeah you can totally do that that's that's one of the
things i love i mean i didn't want to create this platform that just matched reality i wanted to
you know emphasize the features of being a fully virtual platform and one of those is definitely
like i missed the first five minutes of the talk i'll just slide the slider back and watch the talk live
from five minutes ago yeah well i just yesterday actually i was doing something trying to explain
async to someone, which is rough because I don't fully understand async. And I was looking at
Carlton's talk again from DjangoCon, which I've seen twice. And I was like, oh, God. So I was
definitely rewinding and sliding. So it's very necessary to have that. Yeah, the struggle is
very real. Well, Calvin, thank you so much for coming on. We'll have links to everything.
Python WebConf, we'll have the discount codes. Really, I like this. We see you at DjangoCons
in between but to catch up on what's happened last year is really interesting yeah it's always
a great conversation thanks for having me thanks for coming on again calvin so djangochat.com we
are on uh fosstodon and we'll see everyone next time bye-bye bye-bye