Transcript: Coverage.py - Ned Batchelder

00:00.0

Hello, and welcome to another episode of Django Chat, a podcast on the Django Web Framework.

00:10.5

I'm Will Vincent, joined by Carlton Gibson. Hello, Carlton.

00:13.3

Hello, Will.

00:14.3

Hi, Carlton. This week, we're joined by Ned Batchelder. So, Ned, welcome to the show.

00:20.0

So, we're going to talk a lot about testing, but also your career in Python and Django,

00:24.6

coverage and all sorts of things. So, I'm really happy you could come on.

00:28.3

Thanks.

00:28.8

Someone we've wanted to have on for a long time now.

00:31.0

So thank you for making the time.

00:32.4

Sure.

00:32.8

I think we've been exchanging emails about this possibility for a while.

00:36.6

I think from the beginning.

00:37.5

I mean, the short list I had when we started this was you were on it.

00:43.7

Nice.

00:44.3

Wow.

00:44.6

I'm honored.

00:45.4

I'm honored to be early on the list and late on the recording.

00:49.3

You'll notice how practiced we are now.

00:51.7

All of that.

00:52.6

That's right.

00:53.5

It's very smooth.

00:55.0

Yes.

00:56.3

So you are, as a brief intro, so you don't have to say it yourself, you're a pillar of the Boston

01:01.8

Python community. You run coverage.py. You've given lots of talks at PyCons here. You've been

01:09.3

most recently at edX. I think everyone, most people in the Boston Python community know who

01:14.5

you are, and many of the people at PyCons know who you are. But for those listening who perhaps

01:19.8

don't, how do you describe yourself these days in terms of your background in the community?

01:25.7

Yeah, that's a good question. I often say I'm deeply embedded in the Python community.

01:31.0

You touched on Boston Python. So I'm here in Boston, Will, as you are. Or at least last I

01:37.8

heard, who knows what has happened in the 18 months of the lockdown. People seem to

01:42.2

move all over the place. But yeah, so I've been organizing the Boston Python user group for

01:49.5

over a decade, I guess, now. You mentioned PyCon. I have done a lot of PyCon talks, but with the

01:57.4

pandemic and a few missed PyCons, it's actually been a while since I've been to a PyCon, and I

02:03.5

haven't done any of the virtual PyCons, so I need to get back into that. But I have been maintaining

02:09.3

coverage.py for a very long time, as well as some other much less interesting side projects.

02:15.8

And I work at edX as a day job, which is an open source project built on Django, and I work on

02:22.3

the open source team at edX. So my whole life is open source Python, pretty much. So you, I think

02:31.3

we, to be more specific, we both live in Brookline, because I saw your mapping project. And I

02:36.6

thought I saw you and your son, like, two years ago walking along the Riverway, but I

02:43.9

didn't want to interrupt you. And you have this very cool mapping project I was looking at on your

02:49.4

site, which we'll have a link to. And so I think I have a pretty good idea where you live, but I live

02:54.1

in Brookline as well. That's right. So during the pandemic, my exercise of choice has been

02:59.7

walking, because it keeps me apart from people and I can do it from home. And my son was at home with

03:05.2

me and he needed to walk too. And to keep it interesting, I've been trying to walk on new

03:09.3

streets every time. And I've done now, I think, 302 walks from the house to the house, and only two of

03:15.2

them have missed getting any new streets. So it's been a fun, nerdy way to get exercise. So this is

03:22.2

reminiscent of Kant's Seven Bridges of Königsberg problem. Yes, but the topology is much,

03:29.2

much worse. And we don't have time to get into all of the nerdy details that go through my head

03:36.3

when planning walks and kicking myself for having missed a street I could have walked on and all of

03:41.4

that stuff. Right. Okay. And you put it, I saw an image go past this week, I think. I didn't know

03:46.1

about this project. And then you've blogged about it on your site. That's right. There've been two

03:49.5

blog posts, well, three blog posts technically about it so far. But yeah, when I completed my

03:53.3

300th walk, I did another kind of visualization about how hard it is to get to new streets on

03:59.4

walks 201 through 300, because you have to walk over all of the old streets you've already walked

04:04.0

on to get to the new streets, et cetera, et cetera. And do you end up going ever further from home? Because...

04:10.3

Well, yes, both because I'm getting more fit, so I can do a six-and-a-half-mile walk and not collapse,

04:17.9

but also, if I want to get to a new street, I have to walk farther, because I've already walked on

04:23.5

all the streets within the five-mile radius. Or the two-and-a-half-mile radius, I guess. So yeah,

04:30.1

It's been a lot of fun.

04:31.7

You mentioned it keeps you away from people.

04:33.6

Would you say that's part of a programmer personality type thing there?

04:38.3

Yeah, it sits well with my personality.

04:41.6

I don't mind the social distancing.

04:43.6

Well, that's a beautiful thing to do during COVID.

04:46.2

I'm jealous.

04:47.3

I was stuck here with an infant the whole time.

04:50.7

That's right.

04:51.2

And I've walked on streets during these walks that are within 50 yards of my house that I never had a reason to visit before.

04:59.4

So it is eye-opening.

05:01.6

Well, that's great.

05:03.8

So Python and Django, maybe I could just ask, so you have a long history with Django.

05:09.1

I think I've seen you say back to 2006.

05:12.5

I'm curious if you could talk about that and how you've seen the project develop since

05:16.1

that's pretty early.

05:18.4

That is early.

05:19.0

So actually, my earliest touchpoint with Django was actually part of Boston Python.

05:24.1

So in, I think, November of 2005, Boston Python, as one of their events, said, hey,

05:29.4

let's do sort of a shootout of some web frameworks and different people

05:33.5

volunteered for different frameworks.

05:35.2

I actually got TurboGears as the thing to try out, and someone else got this

05:39.8

other thing, that new thing called Django to try out.

05:44.1

And we built some app,

05:46.2

we chose an app and we each built the same app and compared.

05:50.3

But shortly after that, in January of 2006,

05:52.7

I actually got a new job working for a startup called Tabblo, T-A-B-B-L-O,

05:57.5

not the database tool.

05:58.6

Not the billion-dollar one.

06:00.1

Not the billion-dollar database tool,

06:02.3

the photo-sharing and storytelling website

06:04.6

that was eventually purchased

06:05.8

and then disappeared by Hewlett-Packard.

06:11.0

But that was all built in Django and Python.

06:14.0

And the reason I got the job

06:15.8

is that the founder of the company

06:17.4

had asked around in Boston,

06:20.3

like, how can I find some Python expertise?

06:22.2

And that eventually led him to me,

06:23.6

and we had a coffee, and the rest is history.

06:27.0

And that was on Django 0.91 in 2006.

06:33.1

And actually, my first week on the job, I had given three weeks notice at the old job

06:39.3

instead of two weeks.

06:40.4

And so through bad planning, my first week on the job was a week that my boss and the

06:45.5

founder of the startup was going to be on the other side of the country for a business

06:48.6

trip.

06:48.9

But Monday morning, I showed up in the office and he said, can you build some ACL stuff?

06:54.6

We need some access controls on this app we're building.

06:57.0

See you later. And I spent the week digging into Django and hacking around and building ACLs, and

07:03.7

he showed up, like, Friday afternoon and I showed him what I had built and, you know, we were both

07:07.2

happy with each other from then on. So it was kind of a trial by fire, but it worked out. But that's

07:13.1

like the dream first week, no? Yeah, that's the perfect... do some programming, we're not going to

07:17.4

talk to you. In some ways that, yeah, it was great. I mean, the size of the startup

07:21.7

was, I think, three engineers including me and the founder, and so there was just a lot of heads-down

07:27.6

time just hacking on stuff. There was not a lot of overhead or difficulty organizationally, so

07:34.1

that was great. It was very early on. I remember we had one problem with our home page, that it was

07:40.4

doing 9009 queries to build a home page or something, and the reason was that if you

07:48.3

got a related object and then asked the related object for the parent, it would do a

07:52.9

new query to get the parent, even though the parent is how you got to the object in the first

07:57.0

place. And so I submitted a patch to Django that would just hold on to that object for later so

08:02.3

that we only needed a hundred queries or something. Like, it was just one of those, early on in the ORM,

08:08.1

you know, there hadn't been that much optimization done. So there was a lot of low

08:11.2

hanging fruit. And so early days you could dig into those kinds of things and make

08:15.6

those kinds of patches. The great thing about Django now, in some ways, is there isn't that low

08:20.6

hanging fruit, because it's so mature. But in those days there was still a lot of work that needed to

08:25.9

be done to bring it up to speed. Yeah, no select_related or any of those, you know, conveniences for...

08:31.6

I don't even remember if select_related was an option at the time. Yeah, I mean, no class-based

08:35.7

views either. You had to roll your own functions. But listen, so from my point of view, class-based

08:41.2

views are still sort of mysterious and new and confusing, and almost a poster child for what's

08:48.5

wrong with multiple inheritance. But yeah, definitely, we don't have to get into that. I'm sure, if I

08:53.2

know... No, this is a podcast on Django, like, the whole point is... So we're going to touch on this theme a

09:01.1

bunch, I imagine, but I tend to do a thing once and then keep doing that thing for a very

09:06.8

long time and not consider new things. And the difference between function-based views and

09:12.8

class-based views fits neatly into that narrative. I know how to write a function as a

09:18.9

view, and class-based views to me still seem weird. So you're a bit like... so you fire up a blank,

09:26.0

you know, a blank .py file, views.py, and you need to write a view, you're going to write a

09:30.6

function-based view? That's my go-to, yeah. And I mean, so since those early days of, like,

09:35.9

writing, spending a week writing access controls, I haven't done that much hardcore direct Django

09:41.3

development. So in some ways it's, you know, it's really moved on without me. And I've,

09:49.3

that's one of the challenges, you know, being a long time adopter of technology can sometimes

09:55.1

mean that you're an expert in the way that technology was 15 years ago. And you're stuck

10:01.0

because you've just been spending your time doing something else rather than keeping up with all the

10:05.3

amazing work that Django's been putting out. So, like, I mean, don't even talk to me about async.

10:10.6

Will was asking me earlier on about something, how do I organize a, I don't know, where do I put my

10:17.7

settings files or something, and I described a little file system layout to him and he's like,

10:21.3

oh, old school. Yeah, probably. Yeah, I know, it's hard. I mean, people call me old school now, but

10:26.8

yeah, you were describing, you know, running manage.py with --settings, and I said, you

10:31.9

know, you don't... because you used to, and you can still do it, but you used to have multiple settings

10:37.5

files. You'd have, like, a base and local, production, test. And I would say these days you can use

10:42.8

environment variables, so you have one file and then you swap in with the environment variables.
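
The single-file approach described here can be sketched roughly as follows. This is a minimal illustration, not the hosts' actual setup: the variable names DJANGO_DEBUG and DATABASE_NAME and the load_settings helper are hypothetical, chosen only for this example. A real Django settings module would assign module-level names such as DEBUG directly.

```python
import os


def load_settings(environ=os.environ):
    """Build one settings mapping; per-environment values come from environment variables."""
    return {
        # DJANGO_DEBUG and DATABASE_NAME are hypothetical names used only in this sketch.
        "DEBUG": environ.get("DJANGO_DEBUG", "false").lower() == "true",
        "DATABASE_NAME": environ.get("DATABASE_NAME", "dev.sqlite3"),
    }


# The same code yields different settings depending on the environment it runs in.
local = load_settings({"DJANGO_DEBUG": "true"})
production = load_settings({"DJANGO_DEBUG": "false", "DATABASE_NAME": "prod_db"})
```

The older layout instead keeps several settings modules and selects one on the command line with manage.py's --settings option, which is the habit described in the conversation.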

10:47.5

But still, works for Carlton. So, you know, all the time I still, you know, I still specify my settings

10:54.5

on the command line because, like, who knows where they might be otherwise? You know, even if it's one

10:59.3

file, it's still... I think what saves me is I'm constantly teaching beginners, so I feel like I'm

11:03.5

obligated to find out if there's a better way. Whereas for my own stuff, I mean, it's painful.

11:10.1

It's like, what's the point? Like, I know what I want to build, I just want to build it, you know.

11:13.0

It's sort of like installing Python, if you like. You know, everyone has a different way

11:17.8

to do it, unless you... Though actually, I do want to say, though, like, with running Boston Python,

11:22.4

unusually, you get a lot of exposure to a wide spectrum. I mean, because you have

11:28.6

talks and there's project nights, you get total beginners and then you get postdocs at science

11:36.0

places. So I want to just broadly ask you about that because you probably see more of a spectrum

11:41.2

than I do, whereas most people are in their little lane with work, what's in front of them.

11:46.9

Yeah, it's interesting. I mean, so you mentioned teaching beginners. The great thing about teaching

11:50.7

beginners is you don't have to bring them up to speed from the way things were in 2006, right?

11:56.3

You can, you can just pretend that the timeline starts now, like now we do it like this and,

12:02.3

and the old stuff doesn't matter, right?

12:05.0

Carlton and I are, are burdened with all that luggage.

12:08.3

So we, we will probably carry some of that around and, and we may touch on another recent

12:14.0

thing, which was me redoing my website, my personal website as an actual Django project,

12:18.7

which, you know, full disclosure, I think I have seven or eight different settings files

12:23.4

with different names of like base

12:25.3

and that server name and et cetera, et cetera.

12:27.4

And I know there's a better way to do it,

12:29.0

but I also know that if I go and look

12:31.0

for the better way to do it,

12:31.9

I'm going to be presented, you know,

12:33.1

in classic Python style

12:34.4

with a dozen different options

12:36.1

hacked together by a dozen different people,

12:38.2

each of whom thinks their way is the best.

12:40.5

And it's hard to sort it all out

12:42.5

and et cetera, et cetera.

12:44.4

Again, one of the nice things about Django

12:46.8

is the out of the box,

12:48.6

like Django will tell you how to do it

12:50.7

kind of mentality,

12:52.1

which is a rare island in the chaos of the Python world,

12:56.7

where we love the fact that people can hack together something and get it out

13:01.3

there and people will start using it.

13:02.7

But that makes for a lot of competition among all these solutions that can be

13:06.1

really overwhelming for, for beginners or not even beginners,

13:09.4

but people for whom it's not their, their passion. Like I don't,

13:13.0

I don't want to make settings files. I want to make a website.

13:15.3

So please just tell me how to make my settings file so I can get on with the

13:18.6

thing I'm trying to do.

13:20.0

Right. But it's not JavaScript, right? I mean,

13:21.7

because I feel that, but then whenever I, it's not JavaScript, I'm always like, wow,

13:25.8

compared to JavaScript, Python is gloriously. Yeah. The problem with JavaScript is that they

13:31.9

don't, they don't, and I shouldn't talk trash about JavaScript, but my impression of the JavaScript

13:37.2

world is that every, every time I dip in there, which is about once a year, there seems to be

13:41.7

a new way to do things that is the right way to do things. So they haven't, they haven't embraced

13:48.2

the idea that there are many right ways to do things. They've sort of embraced the idea that

13:52.4

there is always a right way to do things, but it's different every single year. And, and that's just

13:57.3

as, that's almost more difficult to keep up with because now it's not just like, oh, I chose A and

14:03.1

you chose B. It's, I'm doing it right. And you're doing it wrong because you're still doing it last

14:07.2

year's way. That's the way it feels to me, at least. Maybe if I were more embedded in the

14:12.7

JavaScript world, it wouldn't feel quite so foreign to me. But you asked me about Boston

14:18.3

Python and the beginners, and it is true, Boston Python has a wide range of people attending.

14:23.6

You mentioned talks and project nights. We've actually slowed down quite a bit during the

14:28.1

lockdown. We haven't had a presentation night in a while. One of the interesting things about running

14:33.8

a geographically based meetup during a pandemic is that everything goes virtual, and then all

14:39.7

sorts of people who live nowhere near you start showing up to your events. And that's okay, I guess.

14:45.3

I'm not sure, like, what does the word Boston mean in Boston Python if everything's on Zoom now?

14:50.0

I don't know. The San Francisco one, too, has had a number. I mean, I think that's the other large one

14:55.7

that I'm aware of, and they've had a couple. New York is very big. I guess they're not as active.

15:00.1

Right, exactly. Yeah, that's right. I think they put a lot of their energy into PyGotham, which is their

15:05.3

annual, you know, PyCon-like thing. I think that's right. So yeah, Boston Python, we do get a lot of

15:12.1

beginners. It's interesting, we don't get that much discussion from sort of Django

15:17.5

beginners. We get a lot of Python beginners. Yeah. And a lot of data beginners. Yeah. Well, there is,

15:23.5

I'm not sure, there is a Django group, which isn't as large. But I mean, I agree. For a

15:31.9

while I tried to go to a number of the Boston Python ones and almost always saw you there,

15:38.1

which is good. I was like, man, I don't know if I should go all the time, and yet you found the time.

15:42.7

But it's a lot of, I mean, a lot of grad students, or, you know, people getting a graduate

15:47.2

degree in whatever who realize they want to, you know, script something, and so Python. So yeah, not

15:52.8

much of a web focus. Exactly right. And the thing we've been doing in Boston

15:58.9

Python during the lockdown, the new thing we've been doing is weekly office hours. So Mondays at

16:04.3

noon, we just get on Zoom for an hour. And that's one of those things that we couldn't have done in

16:10.6

person because who's going to travel for an hour long thing at lunchtime? But it's easy to do

16:17.8

weekly and there's no prep for it. You just show up and it's like people just start talking about

16:22.2

whatever and maybe someone knows the answer or wants to talk about it and you talk. So that's

16:26.5

been very easy. You all are saints to do that. I feel, maybe it's just because I get a lot of emails

16:30.2

from people, and I do respond to everyone eventually, but I feel like I couldn't do that.

16:36.3

It would drive me nuts. But I wish I could. See, on the flip side, I do get emails from

16:42.2

people and I have every intention of replying to them, and then they're six months old and I just

16:47.8

have to declare bankruptcy. Well, you know, it's important: just because someone emails you

16:51.9

doesn't mean you're obligated to respond, right? No, I know, but I mean, I want to, and then

16:57.4

it didn't... I mean, that's the open source dilemma, right? You can't... you know, no one is owed

17:05.5

your attention, but you get into it because you want to give them attention. And how

17:11.5

do you strike that balance and feel good about it and still make progress on the things you want to

17:15.2

do, and stuff like that? So, I mean, there's a good question: do you have wisdom on that?

17:19.8

I mean, like, you're an experienced open source contributor, what would you say?

17:22.9

I'd go back to Will's point, which is that you are not obligated.

17:27.3

It is very easy when you put a project out there.

17:30.2

So as Will mentioned in the introduction, I'm the maintainer of Coverage.py, which is the coverage measurement tool for Python.

17:36.6

6.0, congrats, by the way.

17:37.9

That was yesterday.

17:38.7

6.0 came out.

17:40.7

Yep.

17:40.9

It was actually, well, there's a funny story about that.

17:44.0

I was being kept awake in an Airbnb in Brooklyn by a next-door neighbor who was having an electronic dance party until about 2 in the morning.

17:52.3

See, that's why you can't leave Boston.

17:53.7

You've got to stay, you know, shuts down at 9.

17:56.4

And while I couldn't go to sleep, I figured, well, I might as well shove out a release of coverage.py.

18:00.6

I meant to do it to coincide with 3.10 anyway.

18:03.8

And if I do it in the middle of the weekend, then maybe it won't crash too many Travis jobs the way major releases often do.

18:10.2

So I did it then.

18:14.1

So, yeah, so coverage.py, right?

18:15.7

So it's a big project in the sense that lots of people use it because there really isn't any competition for coverage measurement in the Python world.

18:25.5

Most people don't even realize that the standard library has a coverage measurement module in it called trace.py.

18:32.0

They just come and get a third-party module called coverage.py, which I maintain.
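
For readers who have not seen the standard library module mentioned here, a minimal sketch of line counting with it might look like the following. The classify function is a made-up example. The counts attribute on the results object is how the module stores its data internally, so treat reading it directly as an assumption of this sketch and not as a documented interface.

```python
import trace


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


# count=True records how often each line runs; trace=False suppresses line-by-line printing.
tracer = trace.Trace(count=True, trace=False)
tracer.runfunc(classify, 5)

# counts maps (filename, line number) to the number of times that line executed.
counts = tracer.results().counts
executed_lines = sorted(lineno for (_filename, lineno) in counts)
```

The module only records code that lives in a file, so running this from a saved script gives the most useful result.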

18:37.4

But as a result of that, there's a lot of issues that get written that I can't debug because they tend to be about esoteric execution environments.

18:46.1

Like, oh, my TensorFlow something or other didn't get coverage measured.

18:50.0

And I'm like, I only understood half of those words.

18:52.4

And I don't have the time to go and figure that out.

18:55.4

So it sits there. It's very easy to feel like, well, I put this out in the world, so I need to

19:01.8

make it work. And if it doesn't work, then that's me being a bad person, not fixing it. And you have

19:08.2

to get past that feeling. You have to allow yourself to take time off or just turn away

19:15.6

from it for a while. There are months where I'm just not interested to work on Coverage.py,

19:20.3

So I just don't, and occasionally it'll feel like,

19:23.8

well, maybe I'm never going to get back to it

19:25.6

and maybe I'm just done with it.

19:27.2

And then something comes up and an interesting bug

19:29.8

or a release and I was like, oh, Coverage.py,

19:32.8

that's a cool thing.

19:33.6

I'll do some stuff on it, you know, it'll,

19:37.3

and if it didn't come back, all right, then it wouldn't,

19:39.3

you know, maybe there'll be a whole year

19:41.1

where it never comes back to me and that'll be it.

19:45.1

I don't know.

19:46.0

I see Carlton nodding.

19:46.8

I mean, you two are both uniquely in this bucket.

19:49.8

I mean, because Carlton, aside from Django, has a number of projects that he maintains over many years as well.

19:55.2

The big one that I'm working to maintain at the moment is the Channels project.

19:59.7

And what you said about esoteric execution environments, it's exactly that.

20:04.0

Someone will post a bug and it'll be like, you know, an Nginx config and a few lines of some logging file, you know, and it's like, I'm sorry, I don't even know where to begin to respond to this question.

20:17.8

And so I've been trying to click "move to discussion" for those, because then it keeps the... which,

20:25.6

it's a new thing, and it's like, well, it's not like I just want to click close, right? But I

20:31.1

can't answer this. It's not really an issue, so maybe if I move it to discussion, it keeps the issues

20:35.9

less interesting, because one of the sort of demotivators for me is when, you know,

20:42.6

the issues are piling up and it's like, oh, I feel a pressure, and it's like, oh, do you know what, I

20:46.6

haven't got the emotional space to even look at the repo. Whereas if I can move those

20:51.6

issues over to discussions, those unaddressable ones, then maybe someone else

20:55.9

finds it searching and they're like, oh, I had this problem and I did this. And that does happen.

20:59.5

But then the issues can be like, oh, actually there's an addressable, you know, code problem

21:05.1

that's identifiable. That's one thing I don't mind. So the coverage.py repo, I think, has 200

21:10.1

open issues right now, going back years. I'm okay with that, whatever. Some people say you should

21:17.6

close them if you're not going to fix them. The author can close it. If they want to close it, they

21:21.8

can close it. You know, I think the signals are pretty clear in that repo: your issue might not get

21:27.9

looked at. And so one thing that has been great is that people maybe are getting better

21:37.1

at making reproducible test cases.

21:39.4

A few times I've gotten a Docker image

21:41.5

with the bug report.

21:44.0

Like here is the Docker image showing it happening.

21:46.5

And I'm like, great.

21:47.7

I know I can, now I can see it.

21:50.8

Because, you know, too often you get people saying,

21:53.2

look, you know, it failed like this.

21:55.2

Can you give me a reproducible case?

21:56.6

Well, here's a link to my Jenkins job that failed.

21:58.6

I don't, how many times, you know,

22:00.5

how far in do I have to click

22:02.1

before I can even maybe find the error message

22:04.4

you're talking about?

22:06.1

You know, if you can't put work into this,

22:07.8

why am I putting work into this?

22:09.5

So, you listeners out there,

22:12.5

when you write a bug,

22:13.6

make it really, really reproducible, right?

22:16.1

Write the instructions for that art school friend of yours

22:19.1

who doesn't know how to use computers

22:20.4

and you are going to be doing the maintainer a huge favor.

22:24.8

That's my advice for everyone else.

22:26.2

You do yourself a favor as well

22:26.6

because the maintainer will run it

22:28.6

and then they'll get to the breakpoint

22:30.5

and they'll be like, oh, okay, I'll click that.

22:33.0

Up a few times.

22:33.9

Oh, that's my code.

22:35.1

Oh, why is that?

22:36.1

you know, you've got a chance.

22:37.9

And I understand that the TensorFlow people,

22:40.5

they don't know anything about how coverage works, right?

22:42.2

And I don't know anything about how TensorFlow works.

22:44.0

So we have to meet in the middle somehow.

22:48.0

That's fine.

22:48.9

I'm willing to dig into it.

22:50.1

It's interesting to dig into it.

22:51.3

There was a bug report about PonyORM,

22:56.6

which is one of these tools that they take Python code.

23:02.2

I think it's the PonyORM.

23:03.6

You write a query in Python code, and then they rewrite that Python code.

23:09.3

And so when your query is running, it's not actually your code running.

23:12.5

It's a rewritten version of your code running.

23:14.8

So when you try to do coverage measurement, coverage doesn't think your code is running

23:18.9

because your code isn't running.

23:20.8

It's been moved over into a parallel universe.

23:24.1

And that was very interesting to see.

23:26.1

I think that was the Pony ORM.

23:27.0

Yeah, it's called the smartest Python ORM.

23:29.3

Okay.

23:30.1

So I have a pull request against Pony ORM to make that a little bit clearer to people, and they

23:34.9

haven't bothered to merge it yet, even though it's really simple. But there you go. Turnabout

23:41.7

is fair play, I guess, right? It's perhaps not reproducible, you'll see. Can I ask about, so

23:49.4

you mentioned that Python has its own internal sort of coverage testing tool. Has there been any

23:55.6

talk of... So we have this in the Django community where there's third-party packages and

24:00.0

Django is kept deliberately quite small. You only pull in things when you really need to.

24:06.4

Has there been any talk of pulling coverage into Python? I mean, I know that PSF has fellows and

24:10.7

stuff. I mean, it's pretty much a standard tool for anyone I know who does Python professionally.

24:16.2

Yeah, there hasn't been. So there's two things you might have meant by that. One is pull it

24:21.6

into the standard library and the other is move it under the PSF organization.

24:24.8

I guess I meant both, just as a question.

24:27.0

Right, either.

24:28.0

Yeah, so there hasn't been any discussion of either of them.

24:32.1

These days, I think Python is doing the right thing by being very reluctant to put things in the standard library

24:38.0

under the theory that third-party packages can evolve more frequently

24:42.9

and will keep the bloat down on the Python releases.

24:47.5

And it's just better to have things decoupled than to have everything piled into the box.

24:53.4

That, you know, we should all get better at installing and using third-party packages rather than hoping everything's going to go into the standard library.

25:03.0

And the PSF thing, I think that moving—so the two projects that I know off the top of my head that are under the PSF organization on GitHub are Requests and Black.

25:14.4

And I'm not quite sure—I don't know, that one always struck me as an odd thing.

25:17.9

I'm not sure why the PSF, which doesn't even own the Python repo, why it would own third-party package repos.

25:26.5

I don't know the story on Black.

25:28.1

I mean, on requests, it was because the maintainer was stepping away and there was some other stuff.

25:34.8

So I think it wanted to keep it going.

25:36.7

I'm not sure about Black.

25:38.4

But I know they have someone.

25:39.9

Sorry, go ahead.

25:40.9

Black was started by Łukasz Langa, right?

25:43.6

So he's well in place there in the PSF and very involved.

25:49.4

He's hardly walking away from it.

25:51.1

So I don't quite know what it means when something is under the PSF organization.

25:57.2

Because it's not the PSF that's maintaining it.

25:59.6

Right.

25:59.8

I suppose it just means maybe a fun...

26:01.8

Well, yeah.

26:03.3

We always, I mean, we, Django, look to the PSF as sort of a big brother-sister in terms of size on these things.

26:10.7

Because many of the same issues around.

26:13.0

Yep.

26:13.6

organizing. Yeah. And edX is going through a similar transition right now as well. So

26:21.8

we often look to the PSF for how to do things. For instance, the PSF has

26:26.0

PEPs, Python Enhancement Proposals. Open edX has OEPs, Open edX Proposals, I guess we call them.

26:34.1

Not quite the same acronym. But so no, moving coverage.py hasn't been suggested. And maybe

26:39.5

that's just because I keep maintaining it enough. Probably. I mean, everything's fine, I guess, right?

26:46.5

It's not a squeaky wheel. So it's not a squeaky wheel, right. I mean, it's interesting maintaining

26:51.2

coverage.py because I get involved with new Python releases, because coverage is intimately

26:59.0

aware of how different Python releases run the trace function to indicate what lines

27:05.4

got run and which didn't.
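
As a rough illustration of the mechanism being described, the sketch below uses sys.settrace to record which lines of one function ran. It is a simplified assumption of how a line tracer can work, not coverage.py's implementation; the helper executed_lines and the example function sign are made up for this sketch.

```python
import sys


def executed_lines(func, *args):
    """Run func(*args) under a trace function; return line offsets (from the def line) that ran."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # ignore every other function
        if event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer  # keep receiving events for this frame

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return seen


def sign(n):
    if n < 0:
        return "negative"
    return "non-negative"


# Offsets are relative to the "def sign" line: 1 is the if, 2 and 3 are the returns.
ran = executed_lines(sign, 5)
```

Because the interpreter decides when to fire each line event, a change in how a Python release reports lines changes what a tool built this way observes.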

27:07.8

And 3.10, there's two reasons why I coordinated the Coverage 6 release with the 3.10 release.

27:18.0

Mark Shannon was doing a lot of work in 3.10 to fix, quote, fix how the trace function

27:25.0

traced things.

27:25.8

There were lots of weird edge cases, and he was doing a lot of work to fix that.

27:33.3

And when he fixed it, coverage.py would break because coverage.py was used to the old broken way that things were getting traced.

27:41.7

And so I think I wrote, I think, 10 different bug reports against Python 3.10 during its alpha phase saying it looks like this now.

27:49.8

Is this what you meant?

27:50.8

And about half of them were, yes, it's what I meant.

27:53.0

And about half of them were like, oh, no, that's not what I meant.

27:55.2

And so we were kind of going back 50-50, you know, does coverage have to change now or does Python have to change now?

28:01.8

And we finally got that all straightened out, I think, by the time of at least beta 2, maybe even beta 1.

28:07.9

So I was really happy to be able to push that out there.

28:12.5

But the other thing that coverage.py had to do for 3.10 is that 3.10 now has the match case syntax for doing pattern matching.

28:21.4

And that is a change in execution and, ironically, a change in syntax highlighting, which is something that coverage.py does along the way, too.

28:29.9

So there were a few, it was a very involved release

28:33.1

to get Coverage.py synced up with 3.10.

28:36.4

That's cool.

28:37.1

I mean, the new syntax in 3.10 is quite cool.

28:40.3

In Django, we don't get to use it for years

28:42.0

because obviously we're still on 3.6.

28:45.4

Right, exactly.

28:46.7

We don't get to use the new match case for quite a long time yet.

28:50.0

Yeah, I totally understand.

28:51.8

There was a time when Coverage.py would run on everything

28:54.7

from Python 2.3 up to, I think, 3.4 at the time

28:59.3

or something like that. Coverage 6 is mostly called 6 because I dropped Python 2 support,

29:04.4

so now I can use f-strings, for instance. So we're on 3.6+ now as well. I mean, what's your

29:12.1

policy following the... because Python 3.5 was end of life at the end of last year, and 3.6 is

29:19.9

going to be end of life this December, I think. And so what's your policy about dropping support

29:24.9

now, with the new annual release cycle?

29:27.2

I tend to be very, very accommodating.

29:30.9

I know some people are like,

29:32.6

oh, Python 2's out of support on January 1st, 2020.

29:36.4

I am pushing my new version that drops Python 2

29:39.1

on January 2nd.

29:41.2

And I'm doing it, you know, for the good of everybody

29:44.1

because everybody should move up and et cetera, et cetera.

29:46.7

And my feeling is, look, I'm building this tool

29:49.4

so that people can use it.

29:50.7

And if keeping Python 2 support isn't a pain,

29:54.7

then, you know, why not just keep it?

29:57.8

Coverage 6, really what happened was that there was a different change in coverage

30:02.1

that felt like it was going to need a major version bump because it changed the behavior.

30:07.2

Coverage 5 and before would often accidentally measure third-party packages

30:12.3

that you'd installed in your site packages.

30:15.1

And in Coverage 6, we made a change that it was cleverer about where the code was coming from,

30:20.9

and it would automatically exclude things that had been installed

30:23.6

where third-party packages go.

30:26.1

And that's great, but some people said,

30:28.3

you know, actually that kind of broke my coverage

30:30.8

and I have to go in and change my configuration.
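
One plausible way to decide whether a file lives "where third-party packages go" is to compare its path against the interpreter's site-packages directories. The sketch below does that with the standard library's `sysconfig`; the helper names are hypothetical, and this is an assumption about the general idea, not coverage.py's actual logic.

```python
import os
import sysconfig

def third_party_dirs():
    """Directories where installed third-party packages normally live."""
    paths = sysconfig.get_paths()
    return {os.path.realpath(paths[key]) for key in ("purelib", "platlib")}

def looks_third_party(filename, dirs=None):
    """Return True if filename sits inside one of the given directories."""
    dirs = third_party_dirs() if dirs is None else dirs
    real = os.path.realpath(filename)
    return any(real.startswith(d + os.sep) for d in dirs)

site_dir = sorted(third_party_dirs())[0]
print(looks_third_party(os.path.join(site_dir, "somepkg", "mod.py")))
```

A project that really does want an installed package measured would then need to opt in through configuration, which matches the breakage described here.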

30:33.5

And so that felt like that should be a major version bump.

30:35.7

And since I was doing a major version bump

30:37.3

and it's been 18 months since Python 2 went away,

30:40.0

I figured I might as well, you know,

30:41.6

let's clean this up now and drop Python 2.

30:45.2

Folks can still pin to the old version.

30:47.2

And they can pin to 5.5 if they want.

30:50.5

Yeah, that's always the consolation is,

30:52.4

You know, you're not, by dropping support, you're not dropping anyone using your code because they can just keep using whatever code they were using.

30:59.0

I mean, super.

30:59.8

So there was an article in the news this week, you know, the tech news about, what was it, that coverage itself isn't a good marker of the test base.

31:11.7

And what they dug into was that it's the number of tests that help, like, you know, coverage is handy.

31:17.1

And I remember asking you a question on Twitter about this a few months ago about is there a way of sort of marking which tests are meant to cover what?

31:25.1

Because I might have a few Selenium tests, which are kind of big end-to-end ones.

31:29.1

And just by running those, I get quite a high coverage number.

31:34.1

Yeah, exactly.

31:34.7

So there's enough to riff on.

31:36.5

I guess I'd ask, well, what's your thought on that and how coverage works, you know, and how you build a quality suite?

31:42.1

And that is a tricky problem because, for instance, even if you write one test and then run your project on coverage, you're probably going to get like 35% coverage just because all of your import statements will have run and all of your definition statements will have run.

31:56.1

And all of that gets measured.

31:57.8

And it doesn't mean you're one third tested yet.
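
The effect is easy to reproduce. The sketch below executes a small made-up module under `sys.settrace` without calling any of its functions, and shows that the `import` and `def` lines still count as run. The module source and the helper `lines_run` are assumptions for illustration only.

```python
import sys

SOURCE = '''\
import math

def area(r):
    return math.pi * r * r

def perimeter(r):
    return 2 * math.pi * r

def describe(r):
    a = area(r)
    p = perimeter(r)
    return f"{a:.1f} {p:.1f}"
'''

def lines_run(source, filename="<demo>"):
    """Execute source as a module and return the line numbers that ran."""
    code = compile(source, filename, "exec")
    seen = set()

    def tracer(frame, event, arg):
        # Skip frames from other files, such as the import machinery.
        if frame.f_code.co_filename != filename:
            return None
        if event == "line":
            seen.add(frame.f_lineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        exec(code, {})
    finally:
        sys.settrace(previous)
    return seen

executed = lines_run(SOURCE)
# Lines 1, 3, 6 and 9 (the import and the three defs) ran on import alone;
# the function bodies on lines 4, 7, 10, 11 and 12 did not.
print(sorted(executed))
```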

32:01.0

And by the way, depending on what you do in your asserts, you might literally not be testing.

32:05.3

You might be running lots of code and not testing any of it.

32:08.0

Right. So I didn't exactly read... I think I saw that headline go by about the coverage

32:15.0

problem. I didn't go and read the article. But your point is a good one, that you can write

32:21.4

a test which exercises lots of code but can't actually assess the results of all that code. And

32:28.0

if you had a way of marking a test, saying that when I run this test, I don't want you to count

32:34.1

the coverage, or only count the coverage on this part of it... Yeah. And there have been a few

32:40.5

ideas about that in the coverage issue tracker, none of which we've implemented. The big...

32:47.3

I keep saying we, but it's really just me. You and your alternate person. That's right.

32:53.5

Yeah. I code up whatever my Rice Krispies tell me to.

32:58.4

The big thing in Coverage 5 was that the package can tell you, for each line of code,

33:08.3

which tests ran that line. That's amazing. Which kind of gets at that same issue, right? So

33:13.9

you can take a look at the coverage report and you can see, oh, that line was actually only run by

33:19.4

the integration test, for instance. It was that change in Coverage 5 which

33:25.1

prompted my question: could I somehow work this backwards, such that I could say that I'm

33:30.8

writing this test and it's for that function? Right. And one of the things that

33:35.8

keeps me interested in coverage is those new kinds of ideas. Like, that's a whole other interesting

33:40.6

question: what can I say about a test that could tell coverage to focus more accurately on

33:49.2

what that test is doing? Like, some people will say, well, it should only measure the coverage of the

33:53.6

immediate functions it calls, but it shouldn't measure the coverage of any function

33:58.9

that those functions call. But that doesn't feel right. That's too crude a measure. And I don't

34:04.2

know if anyone's going to want to put the work in to say, well, for this test, only measure the

34:08.5

functions in that module. And then for that test, and you've got a thousand tests, you're not going

34:13.7

to go and decorate a thousand tests. So what's the right balance there between the effort from

34:20.8

the developer to say what they know and what they want about the coverage. And then how do you say

34:25.9

that in a way that coverage can make use of? That's a very interesting thing to me. And that

34:30.5

might be the next big thing, right? So coverage five had a lot of big changes. We switched to

34:35.2

SQLite and we put in the context measurements so you could say which lines were measured by which

34:40.2

tests. Coverage six had a lot to do with fixing the tracing with Python 3.10. What's next? I don't

34:46.3

know. What's the big feature for Coverage 7?

34:48.1

Maybe it's something like that.

34:50.7

But I want to make sure that it's something

34:52.0

that people will actually get benefit out of.

34:54.6

You know, we can invent esoteric, strange features, and if no one uses them, then

35:01.3

what was the point? So yeah, it's good to be guided by the questions people ask on the issue

35:06.2

tracker about doing these sorts of things. Yeah. I mean, so the flip-side question was:

35:12.9

say I've got these nice unit tests which are targeted specifically just to their one

35:17.8

function, and then I realize I need to refactor, and what I actually then need is an integration test

35:24.0

around the outside, and I kind of need to napalm all those nice unit tests. I mean, what's your

35:30.7

thought when you face a challenge like that? How do you address that kind of issue? I mean,

35:34.9

it's a lot of work. Unit tests, by their design, are tied to what the units do, and so moving

35:42.4

the units around is going to require moving the unit tests in some way: either, you know, getting

35:48.0

rid of half of them, or writing 50 more of them, or changing what they all expect. And it's a lot of

35:54.5

work. And, you know, some people take the attitude, like, I'm not going to bother

35:59.1

with unit tests. What matters is whether the application works as a whole, and I'm just going

36:03.3

to write a bunch of big integration tests, and what are the chances that a big problem will get

36:08.9

through those? Maybe it's a good trade-off of benefit versus effort. It's hard to know. And by

36:16.5

the way, so I should say, though, for contexts, coverage can tell you which tests ran which

36:23.3

lines of code, but you can also do things like run your integration tests and say mark all of

36:29.1

that coverage as integration and run the unit tests separately and mark all of that coverage

36:34.0

as unit. And then every line of code will either say integration or unit or both, right? So you

36:39.4

can decide sort of how coarse you want it to be. The problem with marking every test on every line

36:46.3

is if you've got a thousand tests

36:47.8

and you're going to have lines

36:48.6

that are covered by 200 tests,

36:50.7

there's no point getting a list

36:52.1

of the names of 200 tests for a line.

36:55.5

That's just, that's too much information.

36:57.1

I went and looked at my own coverage,

36:59.2

Coverage's own coverage HTML reports

37:01.6

and they're a single file

37:02.9

can have like a two megabyte HTML file result

37:05.5

because every line's got dozens of test names

37:08.8

annotated onto it

37:10.1

and it's just, it's not worth it.
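
To make the per-test versus coarse-label trade-off concrete, here is a standard-library sketch. It records, for each line of a made-up `clamp` function, the names of the tests that ran it, then collapses those names into "unit" and "integration" labels. All names here are hypothetical and the tracing is a simplification; in coverage.py itself the comparable controls are, as far as I know, the `--context` option to `coverage run` and the `dynamic_context = test_function` setting.

```python
import sys
from collections import defaultdict

def clamp(value, low, high):
    if value < low:
        return low
    if value > high:
        return high
    return value

def test_unit_low():
    assert clamp(-5, 0, 10) == 0

def test_unit_high():
    assert clamp(50, 0, 10) == 10

def test_integration_pipeline():
    assert [clamp(v, 0, 10) for v in (3, 7)] == [3, 7]

def lines_by_test(target, tests):
    """Map each line offset inside `target` to the names of the tests that ran it."""
    hits = defaultdict(set)
    code = target.__code__
    for test in tests:
        def tracer(frame, event, arg, name=test.__name__):
            if event == "line" and frame.f_code is code:
                hits[frame.f_lineno - code.co_firstlineno].add(name)
            return tracer
        previous = sys.gettrace()
        sys.settrace(tracer)
        try:
            test()
        finally:
            sys.settrace(previous)
    return dict(hits)

def coarse_labels(hits):
    """Collapse individual test names into 'unit' or 'integration' per line."""
    return {
        line: {"integration" if "integration" in name else "unit" for name in names}
        for line, names in hits.items()
    }

hits = lines_by_test(clamp, [test_unit_low, test_unit_high, test_integration_pipeline])
labels = coarse_labels(hits)
print(labels)
```

With three tests the per-test map is still readable; with hundreds of tests per line, only the collapsed labels stay manageable, which is the point being made here.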

37:12.4

So, I mean, a way of aggregating

37:15.9

that information, right, clump it up at a bigger level... That's not a level that

37:20.2

coverage can intuit by itself, but it gives you the controls where you can indicate those

37:26.2

things when you run the tests, so that you get the information you want. So just the thought that

37:31.8

came up while you were talking about people arguing for integration tests: there's a line I saw, "a

37:36.2

few tests, mostly integration," something... I can't remember the exact line, but it was that kind of

37:40.5

idea: don't write too many tests, keep them mostly integration. I think that's riffing on

37:45.1

the food... Yeah, the Pollan. Yeah, right: eat less, mostly plants. We'll swing over to edX in a minute,

37:51.7

but maintaining a big code base like that, thinking about how you maintain

37:57.7

your tests and manage testing and manage the evolution of the code, I think that's a really

38:02.0

interesting thing, and I wanted to just pull out some of your thoughts on that. I mean, unless you...

38:06.1

Say more, though, please. Well, I mean, to switch it over to edX: the Open edX code

38:13.5

base is a very large project. I mean, the edX organization on GitHub has 300 repos.

38:20.2

There's probably at least a million and a half lines of code. The main repo is called edX

38:25.8

Platform, and it's a giant Django project. So I think it's probably got about 400,000 lines of

38:33.0

code right now, and a lot of tests. The tricky thing about me talking about the technology is

38:38.9

that I've been mostly working on community issues at edX for a long time, so my hands on the bits

38:45.7

of Django are pretty few and far between. But, you know, edX does struggle with the bulk

38:53.0

of tests. We have pretty good coverage, so we've got really a pretty extensive test suite,

38:59.4

and we're constantly rejiggering the sharding of tests across GitHub Actions suites so that

39:07.7

we can keep the total wall time of running the tests to a reasonable limit.

39:12.8

I think right now it's about half an hour to run them all.
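
One common way to shard a test suite so that wall time stays balanced is a greedy "longest test first" assignment. The sketch below is a generic illustration under that assumption, with invented test names and durations; it is not a description of how edX actually shards its tests.

```python
import heapq

def shard_tests(durations, shard_count):
    """Assign tests to shards, longest first, always filling the lightest shard."""
    # Each heap entry is (total_seconds, shard_index, test_names).
    # The unique index breaks ties, so the lists are never compared.
    shards = [(0.0, index, []) for index in range(shard_count)]
    heapq.heapify(shards)
    longest_first = sorted(durations.items(), key=lambda item: item[1], reverse=True)
    for name, seconds in longest_first:
        total, index, names = heapq.heappop(shards)
        names.append(name)
        heapq.heappush(shards, (total + seconds, index, names))
    return sorted(shards, key=lambda shard: shard[1])

# Hypothetical timings in seconds.
durations = {"test_a": 300, "test_b": 200, "test_c": 200, "test_d": 100, "test_e": 100}
for total, index, names in shard_tests(durations, 2):
    print(index, total, names)
```

The wall time of the whole run is the total of the slowest shard, so rebalancing after test timings drift is what keeps that number down.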

39:18.7

But it's – and there's been cycles of we have too many tests.

39:23.8

These aren't telling us enough.

39:24.8

Let's just get rid of these, which, I mean, to be perfectly honest, I was – I felt a visceral shock at the idea of just delete tests.

39:35.6

But the fact is that if they aren't telling you anything and they do take a long time, and especially these in particular were the sort of front end tests that tend to be kind of flaky, where not only are they not giving you any value, but they're taking up your time with false alarms.

39:51.5

So it was the right thing to do is to delete those tests.

39:53.9

And it's hard to strike the right balance because, you know, developers, maybe like many people, but especially developers, tend to be very susceptible to gamification.

40:04.9

You know, oh, there's a metric there.

40:06.3

There's a needle that moves that way.

40:07.8

I want to move it all the way that way.

40:09.8

And so, you know, I'm one of the main purveyors of one of those needles, right?

40:15.4

Coverage measurement.

40:16.2

What's your percentage?

40:17.4

You're not at 100% yet.

40:18.9

You know, you've got to get to 100%.

40:20.6

And that's a lot of work that may not provide much benefit.

40:25.3

So trying to strike a balance rather than just being the best or the last or the first or whatever is really hard.

40:34.8

I don't know if hopefully your team is as familiar with this, but Adam Johnson,

40:40.2

who's on the Django security team, among other things, has a whole book called Speed Up Your

40:44.6

Django Tests, which is, they should look at it. If they're not, it's very well suited to

40:51.0

a large organization and just, I mean, Carlton, we both read it. It's just so many things that

40:57.0

if you're on a big code base, it's like, oh yeah, that helps, that helps, that helps, that helps.

41:00.5

Like, he's got a chapter on profiling. It's worth it just for the chapter on profiling. It's

41:05.6

just amazing. I'll have to take a look at that, make sure people know. Yeah, we'll put a link

41:09.5

in the show notes. One of our challenges at edX is that we use something like 150

41:17.7

different third-party packages to build upon, and we try to stay on the latest... you know, we try

41:25.9

to stay on supported Django releases. We're in the middle right now of moving to Django 3.2 from 2.2.

41:33.6

But it's hard to move. I mean, aside from the question of what do we have to do to our code

41:38.2

to make it work on Django 3.2, we have to go and look at 150 third-party packages and see,

41:44.1

are they running on 3.2? And many of them are not because these packages, you know, they get stale.

41:48.9

And then we have to decide, can we get rid of that package? Can we live with it the way it is?

41:53.6

Do we have to fork that package?

41:56.3

And actually, one of our engineering leads here at edX, the name is Jeremy Bowman.

42:02.0

He's doing a whole talk at the upcoming DjangoCon about how we manage that and how we're trying to actually be proactive in the community to push out what we've learned about other people's packages and getting them onto Django 3.2 so that we can get onto Django 3.2.

42:21.8

So he's doing some interesting work, you know, pushing the envelope forward on how an entire community, not just a large million line project can move to Django 3.2, but how an entire community of third party packages can move forward onto Django 3.2.

42:38.1

Yeah, that's, yeah, I mean, it's difficult because you're maintaining a, you know, so I've just recently bumped AppConf, which I know edX depends on too.

42:48.6

And the only bump was to add the Trove classifiers for 3.2 and 4.0.

42:55.9

Like, but it's still, it's still like, you know, do it, clean up the, you know, clean up the last few issues, make sure, you know, update the package.

43:02.4

It's still, you know, it's a session to do and it needs doing.

43:05.8

And okay, for AppConf, it's once every couple of years, do a release, it's fine.

43:11.2

But if your package needs actual updates, then it's all the more.

43:16.5

Right.

43:16.6

And for us, it's hard to look at a third-party package and decide, oh, is this just a missing

43:21.8

metadata or is it actually not going to work on 3.2?
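
The metadata in question is the package's list of Trove classifiers, which can be read for an installed distribution with the standard library's `importlib.metadata`. The helper names below are made up for this sketch. Note the asymmetry: finding the classifier suggests the maintainers claim support, but not finding it proves nothing either way, which is the difficulty described here.

```python
from importlib import metadata

def installed_classifiers(dist_name):
    """Return the Trove classifiers declared by an installed distribution."""
    return metadata.metadata(dist_name).get_all("Classifier") or []

def declares_django(classifiers, version):
    """True if the classifiers explicitly list the given Django version."""
    return f"Framework :: Django :: {version}" in classifiers

# Hypothetical classifier list, as a package might declare it.
example = [
    "Framework :: Django",
    "Framework :: Django :: 2.2",
    "Framework :: Django :: 3.2",
    "Programming Language :: Python :: 3",
]
print(declares_django(example, "3.2"))
print(declares_django(example, "4.0"))
```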

43:25.7

A question I did want to ask you, Ned, about with testing.

43:27.8

I mean, I think we're all engineers.

43:30.3

We all value it.

43:31.9

Managers don't always.

43:33.4

You've worked in a lot of organizations.

43:35.2

What is your advice for an engineer who is trying to sell taking the time to testing to a manager who may or may not be technical and may not see the value in it?

43:45.8

That's a good question.

43:46.8

So to go back to that job I started on Django, the ideal job where the boss left for a week, that was the same boss who said that the way we're going to test our code is we're going to push it to production and people are going to tweet at us.

44:00.1

Maybe it wasn't tweeting at the time.

44:01.1

We're doing it live.

44:01.7

Yeah, we're doing it live.

44:02.9

And actually, that Bill O'Reilly clip of doing it live was a popular meme around the office.

44:08.8

Absolutely.

44:11.6

And so one thing that happened there was that they were talking about, you know, we're going to ship in a month.

44:16.9

And I was looking at the software and I thought, this is not going to be ready in a month.

44:20.5

But how do I convince them of that?

44:22.6

And what I did is I brought in, I said, we should do some usability testing.

44:26.6

And I got some friends to come in and be the subjects in the usability tests.

44:31.1

And I ran the usability tests.

44:32.4

And I didn't know much about running usability tests.

44:34.6

But, you know, I knew enough to sort of pull the wool over those guys' eyes on the usability tests.

44:40.9

And, you know, I said, look, this guy is sitting down to use the app you said we're going to ship in a month.

44:46.8

Are we ready?

44:48.6

And they're like, okay, I can see it.

44:49.9

We're not ready.

44:52.2

So if the manager is not convinced that testing is worthwhile, keep an eye on the production outages.

45:01.1

and try to convince them to do a root cause analysis

45:05.3

and have the analysis come up with

45:08.3

what is probably the answer,

45:09.6

which is if we'd known that X and Y and Z earlier,

45:13.9

we could have prevented this.

45:17.4

The edX engineering culture is great that way.

45:21.5

We do RCAs all the time and it's a blameless culture.

45:24.6

And I don't know if I've ever heard someone say,

45:28.4

let's just build the feature and get it out there

45:30.4

And then we can write the tests later.

45:31.9

You know, we just have to get the feature done in time.

45:34.6

You know, those sorts of ill-advised trade-offs that you can sometimes hear pointy-haired bosses making.

45:42.4

And that just rots the morale of the engineers, too.

45:45.2

I mean, because I came into technology through the business end.

45:49.0

And I sometimes talk to MBAs who go and be product managers.

45:51.9

And I try to tell them the story of, well, when you go and manage a team of engineers and you're not an engineer, it's very easy.

45:59.5

there's a new product that your boss says, you know, get it done in two weeks, your engineers

46:04.0

say it'll take four weeks, and you crack the whip and find a way to motivate them to get done in

46:09.3

two weeks. And you think, wow, I just got to crack the whip on these lazy engineers. And the engineers

46:14.3

think, well, this person doesn't know anything, doesn't value testing, the code smells, the tech

46:19.0

debt accumulates. And so both sides lose, even though the manager thinks they win. Right. And so

46:25.7

So, you know, having things like, my advice is generally like, because the issue is you want to, when you're in charge of managing the team, like that's the onus is on you.

46:35.3

You want to have an output or an outlet for that.

46:38.9

So I would often recommend saying you need to have like a bug day every month, every couple of weeks, so that the engineers have to prioritize it.

46:46.5

Because every bug is important to an engineer.

46:48.1

But like, how do you, when you're not technical, figure out which bug really matters?

46:51.6

You say, okay, I'm, you know, moving heaven and earth to give you the time and we'll celebrate it and get a gong or something, but you need to prioritize the bugs.

46:59.3

And then you're not just saying no all the time.

47:00.9

You're saying, okay, yes, at this date, and then we'll, and that sort of doesn't solve the problem, but that helps morale a lot.

47:07.1

And it can also sometimes be effective versus, you know, every bug that comes up is an important time off and we'll help you with that.

47:16.0

Right.

47:16.8

Yeah, and giving the engineers some measure of autonomy over some part of their effort is a really good thing to do.

47:26.3

And by the way, the startup that I'm talking about, he was not a pointy-haired boss.

47:31.6

I like to think of it as sort of a healthy debate over the costs and benefits of alternate approaches.

47:39.4

And we shipped a good product, I think.

47:42.5

Like I said, we got acquired by Hewlett Packard.

47:44.3

It's not around anymore.

47:45.8

Yeah. Well, it's like 100% test coverage, right? I mean, it's a goal. The truth is it's somewhere south of that, but the question is where.

47:52.9

Right. Exactly. Exactly. Yeah. By the way, when we got hired by Hewlett Packard, they looked at us and they said, oh, this Django Python thing, how quickly can you port it to Java?

48:04.8

Luckily, those people went away and we never heard from them again. But there was way more craziness after the acquisition by Hewlett Packard than before.

48:14.9

Yes. I think that's something that from the outside, perhaps it seems that large corporations are more stable on the individual level. And most of my experience is the opposite. So we're coming a little bit up on time. Are there any topics that we haven't asked you about or things you want to mention while you're talking to the Django community?

48:33.5

Well, so the Django coverage plugin is a thing. Yeah. And it's interesting. So the

48:42.8

cool thing about Django is it's got these templates, and templates aren't meant to have

48:49.0

logic in them, but they can have a little bit of logic in them. You've got if statements and loops

48:52.9

and things. And there's a plugin called the Django coverage plugin that will tell you which

48:57.4

parts, which lines, I guess, of your templates have been used in your tests and which

49:03.0

have not. And it's an interesting project in and of itself, because when I first wrote it,

49:10.1

I took the strategy of, I don't need to stick to public interfaces. I can use whatever I want

49:16.1

that I find inside Django, and that's fine because it's on me, and I'm taking responsibility for

49:21.2

that. And one of the problems with the 6.0 release of coverage is that I heard from at least two

49:29.2

different people who said 6.0 broke my thing. And I looked at it like, yeah, your thing was using

49:33.8

private stuff that I didn't tell you to use. It's not my fault. I don't have to do anything about

49:38.4

that. You're welcome. You just say you're welcome. It was free. On the flip side, Django coverage

49:44.7

plugin does use internal stuff from Django. But like I said, I knew going in, that's what I was

49:48.9

doing. And when it doesn't work anymore, because Django has shifted, I update Django coverage

49:53.8

plugin to do a different thing. And I have a testing strategy where every week on Sunday

49:59.6

it runs with the latest tip of Django, and if it breaks, I will hear about it and I can update it

50:05.7

quickly before it becomes a problem. Ironically, that test run this week only told me about how

50:11.0

Coverage 6 broke it and not how Django had broken it. But, you know, that's what tests are for:

50:17.3

to tell you this stuff broke, the universe changed, and your stuff doesn't work anymore. So, well, the

50:21.9

template engine doesn't change very often, but there's a bit of work going on at the moment to

50:25.9

optimize, you know, various bits, make it a bit more performant, and so maybe you'll have

50:31.1

some work. I understand that. You might be surprised to hear that inside coverage.py is

50:36.1

its own template engine, partly because it was just fun to write and partly because I didn't want to

50:42.1

have any third-party dependencies in coverage.py, but I wanted to make nice HTML pages. Yeah, great.

50:48.3

Well, we'll definitely link to that. Anything else? Well, I don't know if we're

50:53.7

going to talk about it much, but I wanted to talk to you about cog, which is one of your other

50:58.0

little projects. You claimed at the beginning your other projects aren't as interesting, but cog

51:01.4

is an amazing little code generator. So yeah, quickly: cog started when I was working at

51:08.4

a startup that was doing mostly C++ code, and we had the need to have a SQL schema and Python and

51:15.8

C++ code that matched. And I wanted a way to generate the two from something. And the something

51:23.1

was going to be Python. And I tried using Cheetah, if you remember that templating engine, but that

51:28.3

was for text and not for code. So it was difficult. And so instead I came up with this thing called

51:34.1

cog, which is basically a way to have a text file in which you could embed bits of Python and it

51:40.0

would run through the text file and execute the Python. And whatever the Python generated would

51:44.1

replace the Python in the output. And it worked great for that, for making C++

51:51.6

and SQL. And I actually use it to make my PyCon talks. So I author my PyCon talks in a giant HTML

51:58.3

file using a JavaScript-based slide package, and there's bits of Python in the presentation that

52:06.5

generate parts of the slides. So when I want, you know, a diagram or a table that's easier to generate

52:12.4

with code than by hand, I put in the Python code and it generates that stuff.
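
The idea of Python embedded in a text file whose output lands in that same file can be sketched in a few lines. The toy `expand` function below borrows cog's `[[[cog`, `]]]` and `[[[end]]]` markers but is a heavily simplified stand-in written for this transcript: it captures `print` output rather than offering cog's own output functions, and it ignores comment prefixes and most of what the real tool handles.

```python
import contextlib
import io
import re

BLOCK = re.compile(
    r"(?P<head>\[\[\[cog\n(?P<code>.*?)\]\]\]\n)(?P<old>.*?)(?P<tail>\[\[\[end\]\]\])",
    re.DOTALL,
)

def expand(text):
    """Run each embedded snippet and put what it prints before the end marker."""
    def run(match):
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(match.group("code"), {})
        # Keep the generator code, replace any previously generated output.
        return match.group("head") + buffer.getvalue() + match.group("tail")
    return BLOCK.sub(run, text)

TEMPLATE = """\
// Column indexes, generated from one Python list.
[[[cog
for i, name in enumerate(["id", "email", "created"]):
    print(f"const int COL_{name.upper()} = {i};")
]]]
[[[end]]]
"""

result = expand(TEMPLATE)
print(result)
```

Because the generator code stays in the file and only the text before `[[[end]]]` is replaced, running `expand` again on its own output changes nothing, so the file can be regenerated whenever the Python list changes.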

52:17.4

So it lives on in kind of a much

52:22.0

different environment. And it's one of those packages that I hardly ever do

52:27.1

anything with, but about once a year I hear from someone who's using it for something or wants to

52:31.9

make a change. And so it's got kind of this tiny but dedicated following. So yeah, I guess I

52:38.2

had forgotten about cog. That may be my second most successful side project. It's just

52:42.3

super. Like, I don't know, I've used it for generating, like, JavaScript clients from, you know... Okay,

52:48.7

there you go. That's similar to its original purpose in life. Yeah. And it's like,

52:53.5

you know, whenever I find myself writing, like, really boilerplate stuff, "I'm just repeating

52:59.4

myself here"... Yeah, exactly. You know, reach for cog. It's just an awesome little tool. So I wanted

53:04.7

to thank you for that. Sure. I think that's like levels. Like, they say that, you know,

53:09.2

developers want to do a blog, so you end up building your own static site generator, and

53:13.1

this is like you doing that for talks. So I respect that. Exactly. I'm constantly on sides of sides

53:20.8

of side projects to make nicer-looking diagrams in my blog posts on my Django-hosted, self-written

53:27.7

Django side-project site, etc. Yeah, no, but that's great. That's what you... you know, because from that

53:33.8

comes coverage.py and who knows what else, right? If you lose that sense of play... That's right. Yes.

53:40.7

It's very helpful to have a side project where either you can do it exactly right because you can't in your day job or because your day job makes you do it exactly right in your side project, you can do it wrong.

53:54.5

Like, just having that outlet for whichever thing.

53:57.7

That's right.

53:58.5

Testing.

53:58.9

I don't need that.

53:59.6

Branches.

53:59.9

Everything is master.

54:00.8

Like, let's go.

54:01.5

Yeah, exactly.

54:02.5

Screw it.

54:02.9

That's, like, my site.

54:04.5

Branches.

54:06.6

My personal site.

54:07.3

Not mine.

54:07.8

site here, because we let you slip this by then, but you're rewriting that, and that was...

54:12.7

what was the old... You're rewriting that as a straight Django site now, right? Well, so

54:18.0

it's more complicated than that. So my personal site started in 2002 as a bunch of

54:24.8

Python code that generated HTML that I would FTP up to... So you did have a static site generator.

54:29.9

It wasn't... Well, originally, yes. This is pre-Freeze, right? Was it Flask, a Flask Freeze, I think, right?

54:36.0

There's a Flask Freeze way you could do it.

54:37.6

It wasn't, well, there wasn't a web.

54:39.9

I forget what it was originally.

54:41.9

It was all XSLT and yeah.

54:45.5

Okay, but don't laugh

54:46.5

because it's still all XSLT.

54:48.5

Yeah, that's what I wanted to ask

54:49.8

because it's still the same thing.

54:51.3

It's still XSLT.

54:52.9

So it started as just like,

54:54.5

let's use XSLT and generate a pile of HTML,

54:56.9

FTP it up there.

54:58.0

And then at some point,

55:00.8

I switched it to being a Django site

55:02.9

that I would use a static site generator

55:05.3

with locally to generate a pile of HTML that I would FTP up to the site and to do comments that

55:11.4

had PHP code in there too. And there was some interesting Django middleware that would execute

55:18.0

PHP along the way for local testing or something like that. Yeah, it got really wacky. And then

55:24.8

this summer, my hosting provider said, we're kicking you off. They were getting bought and

55:32.0

And they said they could transfer my site and then they said they couldn't transfer

55:34.5

my site.

55:35.8

So you got to find a new host.

55:36.9

And so I went looking for, so I thought, fine, I'll just do a real Django site.

55:41.9

I made one little stop on, maybe I can just move the site as it is to a new host, but

55:46.5

the old site was PHP 5 and the new host was PHP 7.

55:50.7

And I definitely didn't want to invest any time in understanding how to upgrade PHP.

55:55.4

So I bit the bullet and I made it a real Django site.

55:57.7

It's really hosted on a digital DreamHost.

56:02.5

Sorry, DreamHost.

56:05.3

And so I had to reimplement the whole comment system, which was cool.

56:09.2

And now I can do better things with the comments and et cetera, et cetera.

56:12.8

But it's still very wacky.

56:14.6

So I still generate a bunch.

56:18.0

I import a lot of XML files into a SQLite database.

56:21.8

The SQLite databases are synced up to the server where the Django site will use XSLT to convert the XML into HTML and serve it.

56:32.9

So all sorts of wrong decisions, but it works.
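
As a rough illustration of the import half of that pipeline, here is a minimal sketch using only the Python standard library. The XML layout, the `posts` table, and the function names are assumptions made up for the example; the conversation does not describe the real schema. The XSLT step is left out because it would need a third-party library.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical post format; the real files are not described in the conversation.
SAMPLE_XML = """<post slug="hello-world">
  <title>Hello, world</title>
  <body><p>First post.</p></body>
</post>"""


def import_post(conn, xml_text):
    """Store one XML document in SQLite, keyed by its slug."""
    root = ET.fromstring(xml_text)
    slug = root.get("slug")
    title = root.findtext("title")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts "
        "(slug TEXT PRIMARY KEY, title TEXT, xml TEXT)"
    )
    # INSERT OR REPLACE keeps re-imports of an edited file idempotent.
    conn.execute(
        "INSERT OR REPLACE INTO posts (slug, title, xml) VALUES (?, ?, ?)",
        (slug, title, xml_text),
    )
    conn.commit()
    return slug


def load_post_xml(conn, slug):
    """Fetch the raw XML for a slug; a server would pass this to its XSLT step."""
    row = conn.execute("SELECT xml FROM posts WHERE slug = ?", (slug,)).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
import_post(conn, SAMPLE_XML)
```

Keeping the raw XML in the database means the file that gets synced to the server is the only thing the site needs at request time.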

56:37.7

But like 10 years of, you know, just bolting on a new bit.

56:41.3

Yeah, 20, 20 years.

56:42.8

It'll be 20 years in spring.

56:45.1

But now I can do cool things.

56:46.8

If I have images, I can auto-convert them to WebP and serve them as WebP from then on.

56:54.1

So the first hit does the convert, and then everyone else gets a better image, and I can

56:57.5

just do JPEGs locally.

56:59.8

So, you know, cool stuff.
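
The convert-on-first-hit idea can be sketched as a small cache lookup. This is only an illustration under assumptions: `webp_path_for` and `fake_convert` are hypothetical names, and the stand-in converter just copies bytes, where a real one would call an imaging library.

```python
import tempfile
from pathlib import Path


def webp_path_for(source: Path, cache_dir: Path, convert) -> Path:
    """Return the cached WebP for `source`, converting only if it is missing."""
    target = cache_dir / (source.stem + ".webp")
    if not target.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        convert(source, target)  # only the first request pays this cost
    return target


# Stand-in converter so the sketch runs without an imaging library.
calls = []


def fake_convert(source: Path, target: Path) -> None:
    calls.append(source.name)
    target.write_bytes(b"WEBP:" + source.read_bytes())


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    jpeg = tmp_path / "photo.jpg"
    jpeg.write_bytes(b"jpeg bytes")
    first = webp_path_for(jpeg, tmp_path / "cache", fake_convert)
    second = webp_path_for(jpeg, tmp_path / "cache", fake_convert)
    same_path = first == second
```

The second call finds the file already in the cache directory, so the converter runs once per image no matter how many requests follow.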

57:02.1

Love it.

57:02.9

Maybe it's the last question to take us out.

57:05.3

So edX, I think this is public, has just been a non-profit that's been acquired by a for-profit

57:11.2

company.

57:12.1

I'm just curious what you can say about that, and how do you envision that changing your

57:16.0

role at all. Yeah. So this is a huge topic. But I mean, honestly, I pretty much mostly,

57:24.4

I know what's public. Yeah. Not much more than that. You know, people, so my role at edX has

57:31.3

been sort of the face of Open edX within edX. And then outside of edX, my role has been sort of the

57:38.1

face of Open edX from edX. So I think of myself sometimes, some people think that I'm like the

57:45.2

big brains behind Open edX or something. I have called myself the sidelines mascot for Open edX,

57:51.9

like I'm the guy in the suit dancing around as hard as I can to get everyone excited.

57:57.8

So when the announcement was made that 2U, an ed tech company, was going to acquire

58:03.4

edX, I think that was at the end of June, people were getting in touch with me like,

58:10.6

you know, did you organize this? And I was like, I, I found out about it when you found out about

58:15.0

it. You know, when the news went public, that's when I knew. Yeah. Sometimes you're last to know

58:20.1

when you're on the inside. Yeah, exactly. So yeah, 2U is acquiring edX, but edX is a non-profit

58:27.1

and a for-profit company cannot acquire a non-profit company, or at least, I don't know

58:33.2

the legality, but what's actually happening is that there is going to be a new non-profit formed,

58:37.5

which will get all of the proceeds from the sale because you can't create a nonprofit to further education

58:43.7

and then sell it to a for-profit company and then me keep the money.

58:47.8

I don't get any of the proceeds of the acquisition because it was a nonprofit.

58:53.4

The money that went into the nonprofit has to continue the goal of the nonprofit.

58:58.0

So there's going to be a new nonprofit who continues that goal of education, furthering education.

59:04.9

And most of edX is going to go to 2U.

59:09.2

There will be people who work for the new nonprofit.

59:11.5

We don't know who those people are.

59:12.9

We don't know who's going to run the nonprofit.

59:14.8

We don't even know exactly when this is all going to happen.

59:17.6

It's under legal review by, I guess, the Attorney General of Massachusetts.

59:24.1

But it's going to be an interesting time.

59:25.8

So one of the challenges of edX and Open edX is that, I mean, and this is a classic problem

59:32.5

with any Django project is that you set out to build a platform,

59:37.9

but mostly what you're building is an application, right?

59:41.7

So like Django apps, oh, I'm going to have a blog app in my project.

59:47.8

And your blog app is never really just a blog app, right?

59:51.5

I mean, it's not general enough to be used by other people.

59:54.3

It's always got connections to the rest of the project.

59:57.1

And you hard-coded the name of it in there

59:59.6

because it was just too much trouble to make it a setting,

60:02.1

the right way. And you don't know what the API should be. Yeah. Yeah. Okay. So, and, you know,

60:11.4

Open edX is no different, right? So way back in 2012, you know, edX was building a site to serve

60:19.0

education and it was a Django site. And the goal was always to open source it so other people could

60:24.1

use it. But, you know, mostly it was just edX that we offered to other people to use. And over time,

60:30.3

we've gotten better about making it more of a platform and more generalizable.
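
The hard-coded name versus setting point from a moment ago can be made concrete with a small sketch. `BLOG_TITLE` is a hypothetical setting name, and `SimpleNamespace` stands in for Django's settings object so the example runs without Django installed.

```python
from types import SimpleNamespace


def get_blog_title(settings, default="My Blog"):
    """Read a hypothetical BLOG_TITLE setting, falling back to a default."""
    return getattr(settings, "BLOG_TITLE", default)


# SimpleNamespace stands in for django.conf.settings in this sketch.
project_a = SimpleNamespace(BLOG_TITLE="Example Blog")
project_b = SimpleNamespace()  # a project that never configured the setting

title_a = get_blog_title(project_a)
title_b = get_blog_title(project_b)
```

Reading the value through one function with a default is what lets a second project reuse the app without editing its source.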

60:35.9

But all along the way, we were deploying from master, and we deploy multiple times a day from

60:42.4

master today to edX.org. And that's, you know, the business that pays my salary. So it's very

60:48.2

important that we keep it running. Meanwhile, we also want to get contributions from people

60:52.1

into that open source repo. But how do we manage that risk? Right? They're not running a site that

60:57.9

has 30 million, 40 million, I don't know the exact count now, 50 million learners right now.

61:02.2

So they're not dealing with the kinds of scale we're dealing with. They don't know what our

61:05.7

roadmap is. So up until now, the contribution process has been very tightly controlled by edX.

61:11.0

Every change has to be reviewed by edX, roughly. We're opening it up a bit now, but for the most

61:15.9

part. Now that there's going to be a separate nonprofit that's going to own Open edX, separate

61:22.6

from the edX company that's running the edX.org website, what's the new contribution model

61:28.1

going to be?

61:28.5

What's the flow going to be?

61:29.6

How do we keep code moving at high velocity and keep the business stable under two separate

61:37.5

legal entities, one of whom is going to want, for the most part, to increase contribution

61:42.3

and one of whom, for the most part, is going to want to keep the business stable?

61:46.1

So that's a whole new open source dynamic that we're going to have to navigate.

61:51.9

So it's a very interesting time.

61:54.4

It was when your colleagues came on, you know, a year or so ago and talked through Open edX.

61:59.3

I went and checked it out and downloaded and tried to get up and running, but it wasn't easy to get up and running as a new contributor, as a new, like, you know, I know my way around Django, but it was like, well, hang on, this is tricky.

62:12.2

Yeah, it's big.

62:13.3

It's big.

62:14.2

And, you know, like I said, for the most part, edX engineers are focused on how can we make edX.org a little bit better today, not how can we make it easier for someone else to run another site that also does education.

62:25.6

Like, no one's opposed to that goal.

62:28.0

It's just probably the fourth or fifth goal in their list, and they're not being measured on that goal.

62:33.6

So it's very easy for it to lag.

62:36.0

I think that's, I mean, I have heard that something of a similar dynamic is often, is even the case with, say, React, which is used within Facebook.

62:44.5

But it's the engineers there want to do that.

62:47.2

And Facebook itself sort of does it to humor them, but doesn't really care.

62:51.1

And so from the outside, it's like, well, Google or Facebook or Microsoft supports this open source package.

62:56.0

And really, it's probably a handful of people on the inside fighting, you know, they're getting paid by those companies.

63:02.2

But, you know, their boss isn't saying, nice job on React, release.

63:07.9

You know, it's like, okay, when you're done with everything else, you can maybe do that if it helps us hire people.

63:13.0

Yeah, and that's one of the things we're constantly, I mean, for eight years now, we've been trying to find good projects that we can compare ourselves to.

63:23.3

In other words, you know, the problems we have, are there other projects that have those problems?

63:28.0

How are they solving them?

63:28.9

Can we use those solutions too?

63:30.6

And there's always differences between the projects that make it not quite a direct map,

63:35.2

and so it's hard to find analogs. But actually, the example of React coming out of Facebook

63:40.2

actually was a recent one that we discussed. I don't know how we'd actually get to the bottom

63:44.6

of that. Like, how do we get into the cubicles at Facebook to see what their actual, you know,

63:50.1

goals and measurements are, and how do they support it? Maybe we should just go ask. I don't know.

63:55.0

I don't know.

63:58.8

But it's fascinating to – I mean, it's one of the tricks of open source is that it's not – there's no one way to do it.

64:06.3

And there's sort of the classic way, you know, just do it like Red Hat does it.

64:10.8

Well, Red Hat doesn't deploy to master multiple times a day.

64:14.3

Well, just do it like WordPress does it.

64:16.2

Well, that's a little different.

64:18.6

So, you know, it's interesting to work in open source and have to sort of rethink it from first principles to make it work for everyone, right?

64:30.7

edX loves open source as a way of expanding our engineering capacity, right?

64:35.4

Instead of having, I don't know what it is, 100, 150 engineers at edX, we could have 800 engineers among all of the people using Open edX.

64:43.4

But we only get those 800 engineers if we can coordinate their contributions in a way

64:48.5

and open up the channels so that the contributions can flow.

64:51.9

And that means we have to tell those 800 engineers how to get started, like Carlton said.

64:56.5

We have to tell them, well, what are we interested in?

64:58.6

Where are we headed?

64:59.2

What's the roadmap?

65:00.4

And edX works like most companies.

65:04.5

There are the edX employees, and we all talk to each other.

65:06.9

But how do we talk beyond ourselves so that we can get those 800 people?

65:10.9

That's the big challenge.

65:13.3

And we're entering a whole new phase of that with this split of the acquisition and the new nonprofit.

65:18.4

I mean, it's a really good example.

65:20.2

If you want, you know, someone was asking on Twitter just today about, you know, what are examples of big Django sites?

65:25.7

Well, Open edX is a really good repo.

65:29.0

There's a lot going on there.

65:30.7

The problem with people, people want those examples because they want to know how to do it well.

65:36.6

And it's easy to find large projects and it's easy to find good projects.

65:40.6

Finding good large projects is really hard because the large projects have been around

65:44.0

for a long time.

65:44.9

And so they've just acquired lots of, first, they've got the archaeological layers of how

65:49.9

best to do it, right?

65:50.9

So way down at the bottom, you've got the function views, and then maybe we've got some

65:54.0

class-based views.

65:55.0

And the settings files, like we said, have been through their evolution, but they've

65:59.5

also just accumulated the tech debt and the cruft from having, you know, 100 people work

66:03.9

on it for eight years.

66:04.8

And it's just hard to keep everything with one voice and best practices spread across

66:10.1

a million lines of code. But isn't that what being an expert in coding feels like, is that you still

66:15.7

feel the same? It's just that people ask you and you realize there's no solution, so you just have

66:20.0

to pick one as an expert. That's how I feel. When people ask me something, I'm like, I don't know.

66:26.1

But then I ask around and Carlton doesn't know, Jeff doesn't know, a couple of other people,

66:29.2

Adam doesn't know. It's like, it's unknowable or there's no best practice because I just checked.

66:35.4

Yeah, exactly.

66:36.5

Well, and sometimes we've got a joke

66:38.6

among architects at edX

66:39.9

that it's always the same answer.

66:41.9

The answer is, it depends.

66:44.0

That's the tagline for our podcast.

66:45.5

We haven't said it yet.

66:46.7

We always, we try really hard not to slip it in.

66:50.4

Right.

66:51.3

But even like beginners come in saying,

66:53.6

well, what's the best way for me to install Python

66:55.6

and get it ready for my project?

66:57.7

Well, it depends.

66:59.4

Are you doing scientific work?

67:00.7

Conda.

67:01.3

Are you comfortable with the command line?

67:03.0

Well, you might like virtualenvwrapper.

67:04.6

But otherwise, why don't you just go into PyCharm and pick new project?

67:07.7

You know, it depends.

67:09.9

These things are complicated.

67:11.5

Right.

67:11.6

And that's the kind of thing that a newcomer, they just want to work.

67:15.2

Right.

67:15.6

And that's not just because they're a newcomer.

67:17.4

It's because it's the part of the thing they're not interested in, right?

67:20.1

I'm not a newcomer.

67:21.0

I would love to not have to think about virtualenvwrapper and just have it be solved for me.

67:27.4

It seems like about once every six months, I have to revisit how I get Python installed

67:31.3

on my computer and how I get virtualenvs

67:33.1

in those versions of Python

67:35.1

and blah blah blah. Yeah, as we wrap

67:37.3

up, I think Python has gotten

67:38.5

better in that, you know,

67:41.1

A, just use Docker. B,

67:43.2

if you just need one version, you can just

67:45.2

go grab the official installer. That works

67:47.2

fine. And then if you need multiple versions

67:49.1

like a regular person, you can use pyenv and

67:51.1

you know, I guess Poetry if you want to get fancy.

67:53.4

But that third case,

67:55.2

hopefully you, it's not

67:57.1

your first time installing Python if you need to have multiple

67:59.2

versions to work on stuff.

68:00.6

Yeah.

68:01.3

Yeah, but you just described the better way. And I think you had three conditionals

68:05.5

along that path. Well, it depends. Yeah, it depends on what they need. I mean, I think

68:11.1

it depends. I mean, so I've been thinking about this for my Django books, because I have a section

68:15.3

on, you know, how do you get to Django? It's like, well, I got to slog through Python. And I have

68:19.4

switched away from Homebrew on Mac and just the python.org installer actually, I think works quite

68:24.9

well. And, you know, yeah, there's all these things it doesn't do. Like, how do you switch

68:28.6

Python versions? It's like, well, there's a whole universe of thought on that. But yeah, we're exactly

68:33.5

doing clean greenfield stuff, and so it's, you know, Django 3.2 and Python 3.10 soon, and

68:40.0

don't have to worry about it. Yep. Well, yep. Anyway, thank you so much for taking the time to come on.

68:47.6

I know we've gone a little over, but I appreciate it. We had a lot of questions around testing and edX,

68:51.9

and again, you were one of the very first people we wanted to have on this podcast, so I appreciate

68:56.4

you taking the time. So now the whole podcast is done. You can just, you can just, this is the,

69:00.9

the series finale. Yeah. It's like Coverage 6, you know, we'll take 18 months off and

69:05.4

No, this is, this is a lot of fun. I always, I always enjoy doing these and it's great to

69:11.6

have a chance to have a longer discussion. I know we tweeted each other and maybe even see each

69:16.6

other in IRC or Discord sometimes, but, uh, having an actual discussion with paragraphs and replies

69:22.1

and thoughtfulness is great.

69:24.0

Yeah.

69:24.2

And I mean, I sort of,

69:25.5

the goal is to lose people.

69:26.9

Like I have passive people who are like,

69:28.5

oh, coding, you have a podcast.

69:29.8

Like let's go listen.

69:30.9

And they get five, you know,

69:31.9

PhDs and something else.

69:33.2

You get five minutes in there.

69:33.9

Like you lost me.

69:34.6

I was like, good,

69:35.3

because I'm not trying to appease you.

69:38.1

That's right.

69:38.7

You're not the audience.

69:39.8

Yeah, that's right.

69:41.0

If you are interested in the podcast,

69:42.5

DjangoChat.com,

69:43.8

ChatDjango on Twitter.

69:45.5

And we'll see you all next time.

69:46.8

Bye-bye.

69:47.5

Join us next time.

69:48.6

Bye-bye.