Transcript: Ethical Ads - David Fischer

Hello, and welcome to another episode of Django Chat, a weekly podcast on the Django Web Framework.

I'm Will Vincent, joined by Carlton Gibson. Hi, Carlton.

Hello, Will. Long time no see.

Yeah. And this week, we are joined by David Fischer from Read the Docs. Hello, David.

Hello. How are you?

Marvelous. Thanks for coming on, David.

Yeah, thank you for coming on. There is a whole bunch of things I want to ask you about, but maybe we'll just start with your origin story. How'd you get into programming, Python, Django, and then we'll go from there.

Oh, all right. So I have a pretty like standard origin story.

You know, I sort of studied math in college and loved programming before that, wanted to get into it, sort of had a programming heavy undergraduate and went right into like a job at a pretty big tech company right out of college.

So that was a while ago. But now I sort of decided, you know, I'm done working for these big companies. You know, I'm not like super negative about them or anything like that. But I wanted to get my hands a little dirtier, wanted to be a little bit closer, maybe to both the business side and the code side.

And so I started working more for startup-size companies, and Read the Docs actually, a couple of years back, not quite three years ago. It was a pretty natural fit for me as they were sort of building out their advertising platform.

Got it.

And when did Python fit into your programming journey?

Oh man, I took like one class that touched on Python in college and it actually helped

me get my first job, which is sort of a random thing.

I did Python in one class, and then this job application was like, people who know Python.

I was like, that's me.

I know Python.

You know, as much as an undergraduate who doesn't know Python knows Python.

But it sort of helped me get my first job.

Didn't do Python at that job for years.

And then maybe four or five years into that job, they were like, do some Python.

So that was how it ended up working.

I was already working on some web stuff, and I ended up picking up Django.

I sort of like looked at a couple of the alternatives at the time and Django, this is in the 0.96 days, ended up being a great fit.

So that was how I got into Django.

It was a long time ago.

Carlton, what version was it for you?

I don't know, about 1.0, 1.1, something around the 1.0 times.

Because I remember going to a conference and there was this, it was like Django, Django, Django.

So it must have been around the 1.0, around that time.

So maybe for listeners, what is maybe Read the Docs?

What's the quick story on that if they're not familiar?

I'm sure most of our listeners have seen Read the Docs, but maybe they don't know the story of Read the Docs.

Yeah, it's sort of an interesting story.

And I'm not the main sort of protagonist in this story at all.

Sort of Eric Holscher and Anthony Johnson are sort of the main folks in this story.

I came much later, although I was an early user of Read the Docs.

I think I started using Read the Docs in like 2011.

I, you know, had an account.

You know, I sort of look back in my password manager.

It's like created date.

I think it was 2011 or something like that.

Yeah, exactly.

That's how you figure that out.

But it, I think it started at like Django Dash in 2010,

which was like a one weekend sprint project.

And it was basically like the automation around Sphinx.

That was exactly what it is.

So how do you do continuous documentation, like building off of every commit to master or, you know, something like that?

And so Eric Holscher, I think at the time, was probably one of the main people behind it.

There was a couple other people at that time who ended up not sort of transitioning, not launching it as a company.

And it just sort of, it took off in the Django community and in the wider Python community.

People started using it and Eric had a day job, you know, working a regular job, like a regular person.

And he would get these pages or calls being like, hey, Read the Docs is down.

And he's like, well, you know, I got my regular job, guys, you know, like.

So eventually, it was so big, so many people relied on it.

I think Eric and Anthony decided to launch a company around it.

And basically, it's funded through a combination of advertising on open source documentation.

They sell sort of ad-free commercial documentation that has a few additional features so sort of companies can buy it.

And then there's also sort of like what's called Read the Docs Gold, but it's basically like regular people can pay a little bit of money and you get some extra perks on Read the Docs, like ad-free on Read the Docs.

And you can sometimes designate a project ad-free.

There's a few different things you can do.

So that's sort of where it all comes from.

And, you know, I sort of looked at this and Eric had produced sort of this guide on how their advertising was different from other people.

You know, it was sort of advertising without tracking people.

So pretty much the opposite of what most advertising does.

And I sort of just emailed Eric out of the blue.

I'd met him before at a conference like, you know, probably seven years before.

I'm sure that he had no idea who I was and I barely remembered who he was at that time.

But I emailed him to ask him a few questions about that, and he was basically like, you should work on our advertising. And I was like, yes, I should. So that was how I got the job at Read the Docs. I sort of did a few projects as a consultant for them and then transitioned over to full-time.

Before we go off into the ads story, which there's much more to talk about there, can I just wind back a little bit? In the early days, was it like Eric just hosting this on his own server?

Like, because it grew up quite big quite quickly, right?

So where's the cost?

Great question.

It's all right if you've just got your blog,

but you're hosting, but when you're hosting-

And this is where a number of the problems came from.

You know, so it was, when I stepped in here,

I was like, man, this is hosted insanely cheaply.

You know, they're doing, you know, 50X or even more,

maybe a hundred X the traffic of some of the other properties I've worked on before. And they have a

budget, like a monthly, you know, infrastructure budget. That's like the same as other places I've

worked that did, you know, 2% of the traffic that Read the Docs did. So it was, it was built to run

insanely cheaply. Now it has some advantages. Read the Docs is almost entirely hosting static

content that is already built, you know, just straight HTML, CSS, JavaScript that's built

through something, built through a builder. So yes, you have to run builders. That's kind of

expensive. But the hosting and the serving is relatively inexpensive. So there are some

advantages there. But yeah, read the docs. Our stats are pretty public. We're up front about

them. It did something like in August, 45 million page views last month. So a lot. It does a lot of

traffic is the answer. And that's just the open source side of the hosting. That's not the

commercial hosting. And we have some privacy protection in there too, like, you know, we don't

send anything to analytics if somebody has do not track marked on their browser, things like that.

So it doesn't count any of that. You run an ad blocker, you don't get counted. This is 45 million

page views discounting all of that. So it does an absolutely massive amount of traffic.
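The Do Not Track check mentioned here can be sketched in a few lines. This is a minimal illustration, assuming request headers arrive as a plain dictionary; the function name `should_send_analytics` is hypothetical and not taken from the Read the Docs codebase.

```python
def should_send_analytics(headers: dict) -> bool:
    """Return False when the browser sent the Do Not Track header (DNT: 1)."""
    # HTTP header names are case-insensitive, so normalize them before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") != "1"
```

A view would call this before emitting any analytics snippet or event, so that visitors with the header set are simply never counted.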

And tech people are quite likely to run ad blockers.

So it would go down all the time because it was hosted on a shoestring budget.

Eventually, Eric sort of moved it over to AWS and had like load balancers, you know,

like things that you would do if you wanted something to stay up.

Yeah, crazy stuff.

Yeah, crazy stuff.

And this worked pretty well on still what was at the time a relatively small budget.

But they launched sort of some crowdfunding campaigns.

Some of these brought in like real amounts of money.

You know, they did like a big crowdfunding push and they got like a real amount of money.

It was like $30,000, which sounds like a lot until you realize that's like a few months of

infrastructure budget. But it was all one-time donations. And so then the month after they got

$30,000, the next month it was like, oh, we brought in $1,000, which is, that's going to

be underwater on the infrastructure budget. So they realized, what do we do here? And the answer

was mostly advertising. Well, I remember that post, maybe if I find it, I'll put it in the links

about, you know, talking basically kind of what you said in the, you know, not joyously jumping

into ads, but basically being like, we need to cover the costs. And I remember thinking it was

really well written and really sort of laid out the dynamic for a lot of people, which is, yeah,

it's hard to charge and can't lose money on something that's... Yeah, that's a side project,

essentially, or it was at one time, you know? Yeah, you're absolutely right. And there is sort

of the reality of the situation that, you know, the budget of Read the Docs is a rounding error

to somebody like Google. It might even be a rounding error to somebody like GitHub, even

pre-Microsoft acquisition. So, like, they could just launch it and if it loses money, whatever,

no big deal. But, you know, for us, it's like real money. You know, commercial hosting does

not bring in as much money as advertising for us right now. Wow. That's interesting because...

Yeah. Yeah. Huh. And, you know, like, we don't have venture capital backing us. There's no sort

of like moneybags, daddy moneybags behind us or anything. It's like, oh, we bring in a little

bit more money. That means we can hire one more person. So it's all sort of bootstrapped. There's

no venture capital at all. Right. Well, I think it's in some ways that forced discipline is the

best thing you can have. I mean, it's unpleasant in some ways, but I mean, back 10 years ago, I was working at a company called Quizlet, which was a top 100 website by traffic with, what was it, two and a half engineers, not a big budget. And I think in many ways, in my tenure there, which was about three years, the main thing we did is we didn't harm it, because I spent so much time trying to recruit people. You know, we kept it free, we kept it up, and, you know, those constraints were a good thing. They were frustrating, right? Because you always have your long list and you're like, man, but it really does, uh, focus priorities. And I think it can also

lead you away from sometimes if you have 20 engineers, it's like, well, they gotta be doing

something. And even though maybe that's not what your business needs. Yeah, you're absolutely right.

You know? So yeah, constraints sometimes are, are sort of, that's how invention happens.

For sure. Well, I love that your code is open source. Your ad server, you know, server and client are both up there. And so I was actually prepping for this interview, going back to the first commit, because I love seeing how people build things. I'll put the link in for people, but essentially, I love how simple the project is. Even today, it's essentially a single ad server app within

Django, because it's quite easy to, I would say, bloat out a Django project. And you've been very

constrained. But I wonder if you recall, you know, building, you know, starting from the kind of that

process, right? Like, what did you start with? To the extent you can recall, like that, you know,

the changes over time, because there's a big difference between prototype to first stage

production to, you know, the scale that that you are at now. Well, you're probably looking at a

commit that's not that old. I think it was as of 2018, maybe? Yeah. So basically, when I started

at Read the Docs, the advertising was basically a Django app that was closed source, but got built

into Read the Docs at compile time. So there's sort of these private extensions in Read the Docs

that are in a private repo. The Read the Docs, the main Read the Docs repo is all public, but we have

a couple of private extensions repos. One is for the things that commercial hosting gives you,

but some other things, we have some other just closed source that are closed source for

a variety of reasons. And advertising was just one of those. It was just in a different repo.

It was just one Django app and it was closed source. And so if you looked at the first commit

of the ad server, it's probably just mostly taking a bunch of stuff from this private

Read the Docs repo and bringing it into here.

But yeah, a bunch of things were sort of renamed.

It was still pretty iterative.

Yeah.

I saw, I think, yeah, the first commit was just, you know, first commit, nothing. And then the

second commit was import ad server. But then, I mean, I could see you started with basic auth,

then at some point you added django-allauth, you know, kind of all the, to me, standard steps that

not every Django developer gets to do because often you just parachute into existing projects

and you don't have that flexibility

and you don't do it a lot from scratch.

So I always love seeing actually how

like a production site is done iteratively

because that's not an experience many people have.

Yeah, it's actually sort of a crazy experience

because I got dropped into this project

and we were basically like,

we're gonna break our ad server out from Read the Docs

because previously it was just a Django app in Read the Docs.

And, you know, it does a very large amount of traffic,

something like, you know, 30 million ads a month.

Most of those are not paid ads, but we're talking about 30 million API requests a month.

It's kind of expensive.

And then there might be additional requests on top of that.

So when we broke out the ad server, we had to build something that on day one is going to handle 30 requests a second sustained, 24 hours a day.

No pressure.

No pressure.

And yes, the first time we tried to stand it up, it absolutely fell.

Okay. What does 30 requests a second sustained look like in terms of infrastructure, in terms of the, you know, what are you running? What's, you know, how many workers are you running? How many, you know, is it using a threaded model or pre-fork? Are you using Gunicorn, are you using uWSGI? How do you serve that much traffic?

It is pre-fork. We are using Gunicorn. And it is, I think we're looking at four, it's either four or six workers. I know I've tweaked this setting, so I don't remember what it is. I could check for you. But it's either four or six or eight workers per sort of instance. And we're running, I think we're currently at six instances.
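As a rough sketch of the setup described here, a Gunicorn config file is just Python. The values below are assumptions: the speaker was unsure whether it was four, six, or eight workers, so four is picked for illustration, and the bind address is hypothetical. The arithmetic at the end is only there to show what 30 requests a second means per worker.

```python
# gunicorn.conf.py (illustrative values, not the project's actual config)

# Real Gunicorn settings; the values are assumptions.
workers = 4              # pre-forked worker processes per instance
worker_class = "sync"    # Gunicorn's default synchronous pre-fork worker
bind = "0.0.0.0:8000"    # hypothetical listen address

# Back-of-the-envelope load per worker, using figures from the conversation.
INSTANCES = 6            # "currently six instances"
TARGET_RPS = 30          # "30 requests a second sustained"

total_workers = workers * INSTANCES
rps_per_worker = TARGET_RPS / total_workers
```

With these numbers, 24 workers share the load, so each one handles a little over one request per second. A synchronous worker can sustain that only if each request finishes well under a second.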

Okay.

So that's about where it is. And that handles it fairly well. And right now we're hosted on Azure. We started out, we were prototyping on Heroku, but now for production, we're on Azure. That's where Read the Docs is. So we decided we'd just be on the same infrastructure.

It is set up slightly differently than how Read the Docs is set up.

Read the Docs is also on Azure, but it's basically just using like base VMs and what are called scale sets where you can just sort of scale the number of identical VMs that you have.

So that's how Read the Docs is set up.

So Read the Docs actually auto scales.

But that's super interesting because, you know, a lot of people, you know, you'd have no idea what it takes to run a site at bigger scale, right?

You build your little thing locally, okay, fine.

You put up a worker, fine, you run it out.

How do I plan in advance if I want to grow to that kind of traffic?

Well, you need to think, okay, you're going to need half a dozen servers,

you're going to need this kind of infrastructure.

So it's really nice that you can come on and share that kind of information.

And it's much easier to build something that will handle that kind of infrastructure

than it was 10 years ago.

10 years ago, it would have been much harder.

Yeah, no, I mean, this is the cloud thing, right?

If I need six servers, I just get six servers.

It's not...

Yeah.

Click a few buttons in the AWS dashboard or whatever,

you know, the Azure dashboard.

Just drag the slider to the right,

and, you know, my bill also scales linearly.

Okay, so you're running the ad server

as a kind of a massive service on the side.

Yeah, yeah, yeah.

I mean, the big reason why we wanted to push it out of Read the Docs is that we had sort of this vision, which just sort of started to happen a couple months ago, of basically taking the ads that we've served for Read the Docs and making it so that we could help other projects, you know, other tech projects, other sort of similar places like Read the Docs that need to earn some money.

How do we help them run ads?

And we didn't want them hitting sort of readthedocs.org API endpoints.

So we sort of said, hey, we'll break this ad server out. It'll be sort of its own thing. Yes, it's part of Read the Docs. Yes, Read the Docs is like the primary user of this service. But we wanted to break it out so it was separate infrastructure, all that kind of stuff.

Right. And so this is where it gets really exciting, because it's ethical ads, right? So I'm putting up a site and I'm thinking to myself, oh, I need to make some money, but I can't bring myself to put the Facebook tracking pixels in and the Google tracking pixels in because I just can't bear to be part of the

massive surveillance capitalism world that we live in. And there's an alternative. So tell us

about the ethical side of it and why I might choose ethical. Right. So this is exactly what

I emailed Eric about. It was probably about three years ago now, almost exactly. And I had some

questions about this. How does it work? How does it work relative to something? Because I had some

familiarity with my last job. I wasn't in the marketing department in my last job. I was the

head of development. But the marketing team would come to me all the time. They needed help. And I

was sort of the liaison with the marketing department. I helped the marketing department

anytime they needed any tech stuff. And I can remember sort of like helping them set up

advertising. And I was basically horrified at everything that they were doing. I have a bit

of a background in security. Some of that extends into privacy. And basically, I was horrified.

You know, sort of standard procedure in this world is to take your customer list, upload it to Facebook

and tell them, Facebook, I would like to advertise to people similar to these people. That is like,

that is standard procedure in advertising, you know, for anything really, not just SaaS companies,

that's sort of anything. So that's sort of like, I was horrified at that. And so I talked to Eric

about this, what do we do differently? And there's a few different things that we do.

You know, one, we basically, as much as possible, don't set cookies.

There are some cookies that are like borderline unavoidable, like the Cloudflare cookie, if you want.

So if you want your site to be protected by Cloudflare, they sort of set a cookie and some things are sort of unavoidable.

But as much as possible, we try to do basically none of that.

And there's a few other aspects.

We try to like align ourselves with the site owners, not sort of against the advertisers, but like advertisers are constantly pushing you to put more tracking. And since with Read the Docs, we were sort of the publisher and we heard very much from our users, like Read the Docs regular visitors, they don't want tracking. They don't want cookies. Even when I started at Read the Docs, I would probably get an email a week about something privacy or security related.

You're running Google analytics. We hate that. You're, you know, whatever, something like this.

So I would get all these emails. So we were basically like, no cookies,

try to align with the site owners. Don't run any resources, nothing at all from the advertisers.

So not just scripts. You can't, you have to take the images from the advertiser and host them

yourselves. Otherwise they'll cookie your users. Um, yeah. Yeah. So just all these sorts of things

you start to realize, I mean, cookies by themselves, there's nothing wrong with cookies, but like you can do a lot of bad things with cookies if you really want to. So all of this, it's a shady industry. There's good players in this industry, there's bad players in this industry, but like we try to be one of the good guys.

There you are. Can't ask for more than that. I mean, it's Read the Docs, so you're dealing with people who understand some of the implications of it. You know, not all developers care

about their privacy, but enough do that it's a meaningful differentiator for you.

We heard from them that they do. So, you know, a lot. I would get literally an email a week,

you know, something related to this, you know, either, you know, turn off Google Analytics or,

you know, respect do not track or, you know, something. So we took a lot of the steps there

around do not track. I don't know if you're familiar with it. It's sort of like a pseudo

standard. Um, it, it, I, it's not a real standard or rather there are real standards around it,

but like, there's no, there's no teeth around it. You can say, yeah, I support do not track.

I tell you that you're being tracked. Boom. Support. Yeah. Sounds like something, uh,

corporations would create. Absolutely. But, um, you know, the EFF sort of has their own idea of

what, of what do not track stands for. And, and there's, they sort of, if you, if you subscribe

to that, there's sort of real things that you're supposed to do, you know, keep server logs,

no more than 10 days if they contain an IP address, you know, things like this. Read the

Docs, actually, we did not originally support this, but we've moved and now we're in compliance.
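The log retention rule quoted here (no more than 10 days for logs that contain an IP address) could be enforced with a small scrubbing job. This is only a sketch under stated assumptions: the record layout (`time` and `ip` keys) and the function name `scrub_old_ips` are hypothetical, and the 10-day figure is the one given in the conversation.

```python
from datetime import datetime, timedelta, timezone

# Retention window as quoted in the conversation.
MAX_IP_LOG_AGE = timedelta(days=10)


def scrub_old_ips(records, now):
    """Return copies of the log records with the IP removed once past the window."""
    cleaned = []
    for record in records:
        record = dict(record)  # copy so the caller's data is not mutated
        if now - record["time"] > MAX_IP_LOG_AGE:
            record["ip"] = None
        cleaned.append(record)
    return cleaned
```

Removing only the IP field, instead of deleting the whole record, keeps aggregate traffic counts usable after the window has passed.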

With the EFF?

Yeah, with this sort of pseudo standard that the EFF has put out for Do Not Track,

both on the Read the Docs side and on the advertising side.

Yeah. Well, I mean, we can...

Sorry, go ahead, Carlton.

Well, what I'd like to ask is, if I'm a site developer, and can I ask you for your expertise here, I don't want to put Google Analytics on my site, and I'm picking Google Analytics as the big one, because I'm concerned about these privacy issues and I just don't want to be part of that. But I would like some analytics. I can grep my logs, I guess, and get some idea of how busy my website is. But what would you recommend if I'm a small website developer? What can I put on that enables me to do some analytics, but ethically?

It's hard. I'm going to be 100% honest with you. It's very hard. There's this newer startup called

Plausible Analytics, and that's exactly what they bill themselves as. And some people will say,

well, that's all marketing, but it partially is and it partially isn't. They're doing some

good stuff there. So I don't want to speak negatively about them. They're doing really

good things. And having more alternatives to Google Analytics is a good thing. Not cookieing

users is a good thing. People who say, oh, but just grep the server logs. Those people,

in my opinion, have not run a real business. They don't really understand what they're talking

about. You get so little from that compared with what you can get from JavaScript-based analytics.

Even if a number of users are blocking it, you still get just so much. You get things like

the time on the page, you get whether they scrolled or not. You can attach actions to

specific things like, did somebody click this button? Maybe that button doesn't trigger another

page view or something like that. You can get all this additional data. It's much easier to filter

out bots. Read the Docs traffic is like half bots. Some of those are malicious, but most of them are

just not. They're just search indexes indexing Read the Docs. And so, you know, we would have

so little data. We actually do use Google Analytics on Read the Docs still. We're always

sort of evaluating a few alternatives here. We've thought about maybe running it server-side,

where you actually hit a Read the Docs endpoint, and we basically strip out a bunch of stuff and

then send it off to Google Analytics from the server-side. And you could do things like drop IP addresses, anonymize user agents. You'd not have any cookies for users, so you wouldn't have any Google Analytics cookies for users, so Google wouldn't be able to tell, hey, this is exactly this person. So these are advantages, but on a site like Read the Docs, you're talking about 45 million additional server-side API requests that then have to hit Google while you wait for a response, right? So there's real drawbacks to this. So you've got to really care about your privacy.
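The server-side idea described above, stripping identifying fields before anything is forwarded to a third-party analytics service, might look roughly like this. It is a sketch, not the Read the Docs implementation: the hit layout and the names `coarse_user_agent` and `scrub_hit` are hypothetical, and the browser detection is deliberately crude.

```python
def coarse_user_agent(user_agent: str) -> str:
    """Reduce a full user agent string to a broad browser family."""
    # Order matters: Edge and Chrome user agents also contain "Safari".
    for marker, family in (
        ("Edg", "Edge"),
        ("Firefox", "Firefox"),
        ("Chrome", "Chrome"),
        ("Safari", "Safari"),
    ):
        if marker in user_agent:
            return family
    return "Other"


def scrub_hit(hit: dict) -> dict:
    """Build the payload to forward: no IP address, no cookies, coarse user agent."""
    return {
        "path": hit["path"],
        "user_agent_family": coarse_user_agent(hit.get("user_agent", "")),
    }
```

Only the scrubbed dictionary would be sent onward, so the analytics provider never sees the visitor's IP address or any cookie. The cost is the one raised in the conversation: every page view becomes an extra outbound request from your own servers.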

Yeah, and you could host your own solution, but, you know, we're talking about a service that's 45 million requests a month. Standing up something that's

going to run analytics on that is hard. It will be expensive. It will cost many hundreds of dollars

a month just in infrastructure. And we tried it a little bit and many of these solutions fell over

at that scale. So there's drawbacks. It's hard. It feels like to me, just broadly with advertising,

there's even more than usual this divide between, if you have a little bit of money, you can not see any ads, and if you don't, you're just gonna get bombarded. I mean, like YouTube, right? In the last two years, YouTube went from almost no ads to now two ads at the start, ads every three

minutes. And probably like a lot of people, I don't watch regular TV really ever. And when I

do, it's for live sporting. And it's just awful because of the ads, right? Everything is a

streaming service. And which ties back to, so YouTube, they have this premium thing and they

have a free month trial and I've been using it. And it's just like, I'm almost tempted to spend

you know, 12 bucks a month just to not have the ads. So I guess, you know, if there was some global, like, Chrome ad thing where I just never saw an ad on a Chrome site, you know, that would be appealing. But that's because I have some capital and I value that time.

Totally.

But I don't like that about the world. I feel that shouldn't be the case. I mean, that's like capitalism to the whatever power.

Put yourself, though, into the advertiser's mindset. Like, the people who are willing to pay $10 a month or $12 a month or whatever it is to not see ads, unfortunately, those are also the most valuable people to advertise to that there are. So you can't charge a market rate for this. You have to actually charge way over a market rate for something like this, because the people who will pay $12 a month to not see ads are worth more than $12 a month in advertising.

That's a good point.

Yeah, it's a bad thing. Like, I don't know exactly what the solution here is.

I've had on my list to try out this Pi-hole thing, where you set up a Raspberry Pi on your local network and you set it up as the DNS resolver for your router so that all traffic goes via it. And it's got a blacklist and it just won't load any ads or anything. And I hear glowing reviews. I haven't had the, you know, day and a half.

This is your man of the people insight, Carlton.

Well, a local solution, it may not be, you know, it doesn't solve the, you know...

I know a lot of people who run them. So, you know, they work really well. You will run into some issues where some sites are just going to not work, and somebody's going to send you a link, you're going to see, oh, this doesn't resolve at all. You're gonna have to log into your Pi-hole and fix something and check it out again. But you will not see ads essentially at all, anywhere. And again, the advertising companies are probably unhappy about this, but again, those are the people who are the most valuable to advertise to. We at Read the Docs, we've sort of like decided, you know, ad blockers, there's nothing you can really do about

it. People are going to ad block. You can try to sort of maybe urge them not to. You could try to

diversify your revenue stream. And that's essentially the tactic that we've taken.

You can think about this the other way too, is so many people are blocking ads. And yes,

as a result of that, the ad industry has gotten, they've had to get more intrusive, more ads,

bigger ads, et cetera. So that's sort of one response to this. But the other side of this is

like people are essentially boycotting advertising. Like, you know, it's a very wide boycott. A lot

of people, they hate it. And for a few different reasons, I think tracking is part of that

problem. I think just more intrusive ads is part of that problem. Although let me tell you,

I remember what ads were like on the internet 15 years ago. They were terrible, you know,

pop under ads, pop up ads, like all those things, you know, interstitials. I mean,

those are still there, but like, you know, pop-under ads are literally the bane of everything. I still get probably an email a week at the Read the Docs advertising email to be like, you're losing money by not running like the worst ads the internet possibly has. And they're certainly right. Like, could we make more money by selling out our user base? Yes. True. Yeah, we could. But you always have to remember that at

the end, your heart will be weighed against a feather, you know? Oh yeah. Yeah. So we don't do

that, but we, you know, I get an email a week probably about it. It's also that, that short

term, long-term thing where you can always, if you do any sort of analytics, it's always going

to show that you should do it because you can't measure the long-term, you can't measure the

subjective brand quality. I mean, even for someone who has ads on their site, like I was at Quizlet

top 100 website, we had a new Google rep every six months, you know, another 23-year-old. And every time I was like, is there actually something useful you can tell us? And, you know, I don't blame the rep, but all they could say is a bigger ad, more prominently up there. And the problem is once you put it on there, you get addicted to the revenue from it and you can't take it off. And then your site looks like Facebook looks like now. So I admire sites that don't have it. I mean, I understand how it happened.

Yeah. I mean, you know, if we put a second ad on Read the Docs, for example, that would probably double the revenue, right? And the revenue is good, but we could probably double it by just putting a second ad on there. But we don't want to do that. You know, there are drawbacks to this. And you hit it on the head, you've got to think long term versus short term.

But I think users can't articulate that, right? Like sometimes people will tell me with my personal site, they love the design. And really what they're saying is they love no ads. Like, I don't fool myself. I'm like, is it the design? I think it's just that there's no ads on it, or there's very small ads on it. But Carlton, you had a point.

Well, I was just gonna perhaps segue back to the Django setup of the ad server, because, you know, you must have some interesting stories there. Perhaps we can go through the third-party apps that you use.

Sure. Yeah, I went through them to re-familiarize myself with them for this, because, you know, some of them you don't work with every single day, so you don't even remember. You're like, what was that for?

Okay, so here's one I see on the list we've got in the show notes here: you've got django-ratelimit. So that's a brilliant app and we should talk about

that, because I think Django doesn't come with rate limiting built into it. I know DRF has got

rate limiting on the API views, but tell us about Django rate limit. Yeah, we actually use it for

probably something different than what a lot of people use it for. A lot of people probably use

it for things like rate limiting logins or something like that. We actually use it for

rate limiting advertising. There's sort of a maximum amount of ads that anybody can click on

in any sort of amount of time. There's a maximum amount of ads that, you know, a real person could

view in a reasonable amount of time. And we sort of use it for those kinds of features. So it's,

for us, it's kind of a security feature or, you know, an ad fraud feature. Ad fraud is real. Like,

you know, I spend more than my fair share of time dealing with it. And, you know, like,

it's one of the easy things when you're just Google or Facebook, you just handle it at huge

scale. But for somebody like us, it's hard. And especially when all the advertising was run on

Read the Docs, and we have sort of an incentive to report things correctly. But when we now have

sort of these third-party publishers who were running ads on their site, that's only started,

I think I mentioned a couple months ago, and they sort of are getting paid out on this,

they have different incentives. And their incentives don't necessarily align with ours.

and we have to make sure that things are legitimate.
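The idea described above, capping how many ad clicks one visitor can plausibly generate in a given period, can be sketched in plain Python. This is a minimal illustration and not how django-ratelimit or the ad server implements it: the `SlidingWindowLimiter` class, the choice of client IP as the key, and the 3-clicks-per-60-seconds policy are all assumptions made for the example.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` events per key within `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        # key (e.g. a client IP) -> timestamps of the events we accepted
        self._events = defaultdict(deque)

    def allow(self, key, now):
        events = self._events[key]
        # Forget accepted events that have aged out of the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        if len(events) >= self.limit:
            return False  # over the limit: do not count this click
        events.append(now)
        return True


# Hypothetical policy: at most 3 counted ad clicks per IP per 60 seconds.
click_limiter = SlidingWindowLimiter(limit=3, window=60)
```

A view would call `click_limiter.allow(client_ip, time.time())` before recording a click and skip the billing step when it returns `False`. In a Django project, django-ratelimit expresses a comparable policy declaratively as a decorator on the view, with a key and a rate string; check the package documentation for the exact import path and options in the installed version.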

Yeah, okay.

You know, we basically have advertising for developers.

Developers know very well how to automate

clicking on ads or viewing ads or whatever, you know, and...

Well, yeah, you know, I've just learned AsyncIO

so now I can click on that ad, you know, concurrently.

Absolutely.

Hundreds of times a second.

Well, that's how you... Yeah, I've used some version of that. We've had to log in to go to the local beach here, and I almost wanted to just set up a thing, you know, to slam the site. I didn't. I think I was too busy with kids. But it's right there for you, you know? You can script-kiddie it and it just works.

I think django-ratelimit made the Django newsletter recently.

Oh, that thing. Oh yeah, I think so. I think it was on there pretty recently. But yeah, so we use it actually for something a little different than that. Allauth sort of has its own rate limiting built in, so we use that for authentication, but that's not as big of a... And we use that both on our ad server and on Read the Docs, allauth that is. But yeah, rate limit is specifically for sort of manually rate limiting advertising as an ad fraud feature.

Can I ask?

So you're an international platform, and there's a number of packages around, you know, country-specific things.

I guess broadly, can you talk about the challenges of moving beyond just being U.S.-based and a global supportive Django project?

Because that's a whole other thing.

Got it.

Yeah, it's hard, and I actually almost don't want to talk about it too much because I don't think that Read the Docs spends, like, an appropriate amount of effort there.

You'd be surprised, but something like 92% of all documentation on Read the Docs is just English.

What percentage is the U.S.?

Oh, you mean traffic-wise.

So traffic is totally different.

Yeah, there's internationalization of language and then there's traffic.

Yeah, so traffic-wise, it's about a quarter North America, U.S. and Canada.

So like 21%, 22% U.S. is sort of the U.S. percentage.

But like in terms of language, like written spoken language, it's 92% English and virtually all of the remainder is Chinese.

So everything else is a rounding error, sub 1%.

But like the support for multi-language sites, right?

If my docs are translated, I can serve them in English and in German and in French.

But the effort to do that is just monumental.

We do it on Django, but we have like literally a whole team of people who, you know, and volunteers who do the translation on each language.

And it's just the amount of effort for a solo, you know, for a smaller project like, you know, my Django filter.

There's no way I could translate the docs for Django filter.

It just couldn't happen.

Yeah.

And this is actually an area...

So Anthony, one of my coworkers at Read the Docs, this is definitely one of his areas of interest.

And he has a few ideas here, like integrating with some of these third-party translation services, like Transifex or something like that. You know, Sphinx supports something similar to what Django supports, where you have these sort of .po files that you can upload to a service and hopefully get a translation for them and then serve them.

It is a little tricky to set up a multi-language project on Read the Docs, but it is possible.

We have a few projects doing it.

Probably the biggest one is Godot, which is like a game engine, a C++ game engine.

They're a huge project on Read the Docs, totally not in the Python community.

Until you start looking at how does Read the Docs make money and what are our biggest traffic projects,

I would have not been familiar with them at all because, you know, it's not in the Python

community. It's not something that I work with on a daily basis, but they have translations for

probably a dozen languages for their documentation. Okay. They are also a very, they are very

expensive to host because their documentation builds take like 20 minutes each. And so it's

like, oh, we committed to the main repository. Now we have to build documentation for 20 languages.

Well, that's one thing I wanted to ask is how clever is the caching on the docs rebuild?

Like, do you kind of check sums up front on the docs folder or that kind of thing to say,

hey, no, do you know what, whilst there's been a commit, nothing's changed, we don't

have to rebuild?

We used to and we removed most of them.

And the reason why is because it's actually extremely hard to do it correctly, because

a lot of times Sphinx projects will use some auto API or something like this that's

referencing code.

So it's actually building,

it's building docs from code directly.

And so it might be a change that isn't in the docs

directory, but because of a change in the code directory,

it affects the output of the docs.

So we just decided at some point: this is too hard, we'll just waste resources and have our builds be correct.

So you're actually in danger of using more energy trying to guess whether you should rebuild them than you would, you know, actually...

Yeah, yeah. It ended up being just sort of a problem that we decided was too hard to solve. And interestingly enough, I think a lot

of the CI services went the same route. So we, we sort of model a lot of what we do off of some of

these CI services like Travis or, or CircleCI. And that's, that's what they're doing. You know,

they're, they're not trying to get clever and say, oh, well, we think the tests are going to pass

because you didn't change anything over here. They just rerun it. Wow. Okay. Okay. Interesting.
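To make the trade-off concrete, here is a minimal sketch of the kind of up-front checksum Carlton asks about, and of why it can be wrong. It is not Read the Docs code: `tree_digest` and `needs_rebuild` are hypothetical helpers, and the layout (a `docs/` folder next to a `src/` folder) is an assumption for illustration.

```python
import hashlib
from pathlib import Path


def tree_digest(root):
    """Return a SHA-256 digest over the relative paths and contents under `root`."""
    root = Path(root)
    digest = hashlib.sha256()
    # Sort so the digest does not depend on filesystem listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def needs_rebuild(docs_dir, previous_digest):
    """Naive check: rebuild only if something under `docs_dir` changed."""
    return tree_digest(docs_dir) != previous_digest
```

If the docs pull API reference text out of docstrings in `src/`, a commit that only touches `src/` leaves the `docs/` digest unchanged, so `needs_rebuild` returns `False` even though the rendered output would differ. Hashing the whole repository avoids that false negative but then almost every commit triggers a build anyway, which is roughly the position described above: always rebuilding is the simple way to stay correct.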

I really want to ask about Stripe, but maybe it just works brilliantly for you and the new API has been no big deal switching over.

We're actually not using most of the new APIs. We're actually using it to pay out publishers, so it's sort of a little bit different.

Connect, or not?

Yeah, Connect. So basically, publishers can sign up for a Stripe account. We don't get their bank account information, but we can basically transfer money to them. So that's what we're using it for. Probably most of our payouts are still PayPal, though, and we don't have a fully automated solution there yet. This is one of the things when you stand up your own advertising network over a couple of months: lots of things are not automated yet.

Well, even Carbon Ads is PayPal. I thought it would be a little more advanced, but it's not. I think a lot of the new Stripe stuff, too, is around subscriptions and, I'm blanking on the European law, but there's all sorts of things around that. So it's less so for one-offs. Though Stripe is finally adding, I think they just added, tax support. Because for someone my size, if I added Stripe, to collect tax per state, per country is impossible to do, even if I tried to, which I have. It's just like, well, I guess it's a no-go. But Stripe is slowly rolling up all these things around analytics, you know, around TaxJar and all these other third-party things you can sub in to do taxes appropriately.

But we're actually not using it for subscriptions, which I think is a lot of what...

Yeah, I guess some of the new stuff is... yeah, that causes a lot.

Yeah. We use it mostly for, you know, sending invoices to advertisers and paying out publishers. Those are sort of our main things. And the Stripe Connect stuff has worked fairly well, although there's sort of a beta: if you have your balance in Stripe in U.S. dollars and you want to pay somebody whose balance is not in U.S. dollars, or actually even just not in the U.S., it's a problem. Stripe has beta support for it, so we have applied to join the beta. So right now we can only pay out publishers via Stripe in the U.S., but maybe that'll change in a month. Everybody else has to use PayPal or something else, or they have to give us their bank information, which, as much as possible, we want to not do.

Well, that's like my mom. Not to pick on my mother, but as an example, you know, she wanted to write a check as opposed to enter her credit card. And I was like, you know your check has your routing and your account number on it, right? And your address.

Yeah, checks are way less secure.

Oh my god, they're amazingly insecure. Handing that to a stranger... the things I could do if someone gave me a check. But anyways.

Well, DjangoCon US has been in San Diego the last two years, will be again next year.

Can you just briefly talk about, you wrote the bid for that, right?

I wrote the bid for that.

So I'll combine this with sort of the San Diego Python stuff.

So we have a couple, there's a couple of people who are maybe bigger, they have a bigger sort

of maybe name in the Python community than I do.

I really focus my efforts locally.

I'm not big on social media.

I have very little social media presence.

But Trey Hunner, who's one of the other San Diego Python organizers, he was basically like, David, you should write the bid for DjangoCon US to come to San Diego.

I was like, all right, fine, Trey, I'll do this.

And so I wrote it.

It was pretty convenient, actually, because at the time I was working in this office in downtown San Diego.

And actually the San Diego Tourism Authority, which is like this quasi government, well, actually they are governmental.

They're part of the local government.

They were in the same building.

So I just sort of stopped by their office and was like, I have this conference, it's this big, it's probably going to bring in this many dollars to San Diego, and I need help writing a bid. I wrote the whole bid, but they pointed me in the right direction so much. DjangoCon was basically like, here's our target budgets for hotels, here's our target budgets for some other things.

And I brought that to the San Diego Tourism Authority and they're like, this is perfect.

These whole hotels in this area, you can't even talk to them.

They helped me so much because I would have spent so much time talking to the wrong hotels that are just double the budget and stuff like that.

So, yeah, it helped a lot.

I ended up writing the bid.

I think there were a couple other bids, but we got accepted.

And I'm sure it's probably more expensive than some of the other places they've hosted, but it's certainly not Bay Area prices or anything like that.

No, for sure.

I mean, San Diego is a lovely place to visit. I've only recently come to appreciate all the work that DEFNA and you, Jeff Triplett, and others there do to organize these conferences. Because, I mean, DjangoCon Europe just happened, and it's so much work to do a conference, and it's, you know, very much not publicly seen. So it's interesting just to hear about what it takes to put that on.

Yeah. The bid is relatively minor compared to the operations and all the other things. So really, Jeff Triplett, and I don't want to just call out one person, you know, that whole team...

Yeah.

...they're what make DjangoCon a success, DjangoCon US anyway. So my contribution is so minor by comparison. But yeah, so San Diego Python, getting back to that. You know, I'm sort of one of the

co-organizers. I sort of took over as probably like the main organizer in maybe 2012. The person

who was the organizer before moved to the Bay Area, which is a common problem in San Diego for

organizers. You know, they just sort of like make a name for themselves or, you know, take the next

step in their career. And the logical next step is move to the Bay Area and make way more money.

So I was the person who, I'm never leaving San Diego. I love it here. It's great. So I was sort

of the person who ran it by default because I was not moving away. But now we have this great team,

you know, like that, that's one of the big, one of my big successes, in my opinion, is that

now there's a bunch of other organizers. And when I need a month off or like something else like

that, Django or San Diego Python continues without me. My, my, my wife and I had a daughter

four years ago and I sort of didn't do any San Diego Python stuff for a year and it worked.

Like, people still went and there were still talks. And, you know, the other organizers really made that happen. So I'd say that's my biggest success: the group will continue when I'm gone.

Well, I mean, in any organization... I mean, here in Boston there's a Django Boston group, which I think is the third largest by members.

We modeled ours after that one.

Oh, okay. Yeah. So it's John Baldwin and, I'm sorry, I'm blanking on

the other. Um, but it's, you know, it's, it's one or two people who do, you know, all the work and

it's quite a lot of work. Um, but it's such a great, uh, attribute for the community. Um, and

even, I guess, even as big as Boston is, which is a pretty big developer community, you know,

we've had, we used to have 80, you know, 50 to 80 people. Um, the Python meetings of course are

like three, 400, but, um, it's a great, that's way bigger than us, but yeah. Okay. Well, well,

Well, but I would say, again, mindful of time, but in Boston, web is not that big a thing.

It's so much more data science or even hardware stuff.

There's not as much web stuff, whereas when I was in the Bay Area, it was very much consumer web kind of things.

There's much more of a, I don't know, PhD-level programming, and the web is sort of this thing you sprinkle on top to deploy it to customers.

Sure. Carlton, what's the scene on the coast of Spain?

Well, where I am, there's me. There's a few little local web dev shops. There's not much here. Down in Barcelona there's a good amount. There's a Python Barcelona meetup there, which, you know, is good. I'm really bad, because I've got so many children, I just don't go. I'm like, yeah, no, I could take out this massive chunk of my life to go down to Barcelona and then hang out till quite late at night and then drive home... I can't do that. So I don't go down very often. But they're a great crowd, and, you know, it's active and they still keep going now. I think they've got an online round table tomorrow discussing, you know, the changes in remote working and all the rest, because they brought in new laws in Spain, like, yesterday about remote working and how that's going to affect everything. So it's, you know, a really active community. I mean, Barcelona, yeah, there's quite a good tech scene there.

I would love to ask you more questions, but I think we're close to time. We're going to have links to the code. I definitely recommend people take a look at the ethical ad server, especially the models.py file. That alone is one of the rare readable production-level code snippets I've seen, well commented and everything else. So I definitely recommend that. ethicalads.io, right? That's the site if people want to have ethical ads on their site?

Yeah.

Anything else as we head out you want to promote or shout out?

You know, no. I think San Diego Python is one of my biggest successes, I feel, so I'm happy that we talked about that. I'm happy we were able to fit that in. Read the Docs is fantastic. I love them too. And, you know, I'm really happy. At Read the Docs I mostly work on the advertising side, and I work sometimes on the security and privacy stuff on Read the Docs as well. Probably my biggest success there, and maybe I'll shout out to Cloudflare, thank you for that: all the stuff where we now have HTTPS on the thousands of custom domains for docs on Read the Docs, that's all courtesy of Cloudflare. That would cost us thousands of dollars a month. That alone would probably double our infrastructure budget if we were paying retail price for that. So I'll thank them for that.

I mean, I use Cloudflare. I feel like there should be a whole

course on Django plus Cloudflare

since it's so powerful.

Yeah.

Read the Docs is actually doing

some really cool stuff there,

but maybe there's not time for that.

We have this whole thing

where when there's a new docs build,

we're purging from the cache

just those docs

and all those sorts of things.

When we rolled that out,

docs, especially if you were browsing them

in Asia or Europe,

they got so much faster,

especially Asia

because it's far from our main data center.
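As a rough illustration of purging "just those docs" after a build, the sketch below turns a list of rebuilt page paths into request bodies for a CDN purge-by-URL call. It is not the Read the Docs implementation. The helper name `purge_payloads`, the example domain, and the batch size of 30 are assumptions; the `{"files": [...]}` body follows the shape Cloudflare documents for purging by URL, but the current per-request limits and authentication details should be checked against Cloudflare's API reference.

```python
import json


def purge_payloads(base_url, paths, batch_size=30):
    """Yield JSON bodies, each listing up to `batch_size` absolute URLs to purge."""
    # Normalize so exactly one slash joins the base URL and each path.
    urls = [base_url.rstrip("/") + "/" + p.lstrip("/") for p in paths]
    for start in range(0, len(urls), batch_size):
        yield json.dumps({"files": urls[start:start + batch_size]})


# Hypothetical usage: pages that changed in the latest build of one project.
changed = ["/en/latest/", "/en/latest/install.html", "/en/latest/api.html"]
bodies = list(purge_payloads("https://docs.example.com", changed))
```

Each body would then be sent as an authenticated POST to the zone's purge endpoint by whatever HTTP client the build pipeline already uses. Purging only the affected URLs keeps the rest of the cache warm, which is what makes the docs stay fast for readers far from the origin.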

Well, maybe we'll have to have you on again.

I mean, yeah, just before this,

I was manually purging some pages on LearnDjango

that I updated and I'm like,

yeah, I know I need to automate this,

but I just can't be bothered.

Yeah.

So I can't even imagine at the scale that you all are at.

Oh, yeah.

And yeah, I won't go into it

because I know we're at time.

But yeah, it's super cool.

I didn't work on that.

So I don't want to take credit for that.

That was mostly Eric Holscher.

But the long and short of it

is you don't go through the UI there

clicking purge, purge, page by page, copy-pasting each URL.

Oh, absolutely not. No.

Well, thank you so much for coming on, David. We'll have links to everything in the show notes. Really appreciate it, and glad Jeff connected us, yeah, Jeff Triplett. Because I was complaining about ads, and he was like, you should use Ethical Ads. And I was like, what's that? And then he's like, and there's someone there you can email. I was like, oh, okay. So here we are.

Yeah. Advertising, it's a crazy business. Lots of bad stuff, but we're trying to do good there

as much as one can. So as ever, we are at DjangoChat.com,

ChatDjango on Twitter, and we'll see you all at the next episode. Bye-bye.

Join us next time. Bye-bye.

Thank you.