Transcript: Authentication - José Padilla
Hello, good morning. Welcome to another episode of Django Chat.
I'm Carlton Gibson and I have with me my friend Will Vincent. Hello, Will.
Hello, Carlton.
How are you? Today we have a very special guest, José Padilla. Hello, José, how are you?
Hello, how are you?
Sí, muy bien, muy bien. Let's do the rest in English.
Jose, we're thrilled to have you on. You and I have met a couple of times, and I think I first became aware of you through your work with JWT, so Django REST Framework.
So what's the, do you have a quick background on, obviously Carlton and I know about your work in Django, but what's the quick take on your involvement in the Django community?
So I spent some time building a product called Blimp.
We were a small team building a project management software.
And I remember our first MVP was actually built in Flask and like MongoDB.
And after failing doing that, right, we switched back to Django.
It was like probably like Django 1.2.
And so I spent at least like around five years working with Python and Django.
So I had a chance to help maintain Django REST Framework
and to pick up maintaining other libraries like PyJWT.
Yeah, and like through that, I've worked on random other bits and pieces.
And by far, Python keeps being my favorite community.
Yeah, well, we'll link in the show notes.
But I remember seeing you gave, was it in 2014, at DjangoCon?
You gave a talk on JWTs, which I think was your first time at a Django conference.
Is that right?
That's right, yeah.
And I think that went well.
I didn't actually give that talk anywhere else, probably at another Python meetup, but not at a conference.
What I liked most about that talk was visually explaining, at that time, what a JSON Web Token is.
I think I did a good job at that,
and at tying that up to authentication with Django REST Framework and Django.
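To make the structure of a JSON Web Token concrete, here is a minimal sketch using only the Python standard library. It is not taken from the talk or from PyJWT: the function names are illustrative, it only handles HS256, and it skips checks a real library performs (the `alg` header, expiry claims), so a real project should use a maintained library such as PyJWT instead.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    """Base64url-encode without padding, the encoding JWT segments use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding that JWTs strip."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def encode_hs256(payload: dict, secret: bytes) -> str:
    """Build header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    segments = [
        b64url_encode(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        b64url_encode(json.dumps(payload, separators=(",", ":")).encode("utf-8")),
    ]
    signing_input = ".".join(segments).encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return ".".join(segments + [b64url_encode(signature)])


def decode_hs256(token: str, secret: bytes) -> dict:
    """Verify the signature, then return the payload claims."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("signature mismatch")
    return json.loads(b64url_decode(payload_b64))
```

The point the sketch illustrates is that the first two segments are only encoded, not encrypted: anyone can read the claims, and the third segment is what lets the server detect tampering.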
You know, I have to say, Jose, because you were working on Django REST Framework for just ages and ages, for years.
We were there and you were an absolute machine maintaining the framework. I mean, maybe you were doing it as part of your work, so you had the time, but you were there making the fixes, doing the updates, an absolute rock for years.
Thank you, thank you. That means a lot coming from you, definitely.
And looking back, I think what I contributed the most at that time was the story around third-party extensions, and how that story fits into Django REST Framework and Django and the Python community as a whole.
But yeah, it was definitely not part of my job. This was a free-time kind of thing.
Okay.
But I definitely was glad I was able to help out with things like that.
Yeah, no, and you were an absolute stalwart. But that's an interesting point you made about the third-party extensions, because it's something that in Django REST Framework we were really keen on and still are keen on.
Because we have very limited time and limited bandwidth to maintain stuff.
And so someone comes along with a new feature idea.
And it's like, well, first of all, can we put that into a third-party package that that contributor can maintain themselves?
Because that keeps the core maintainable, right?
Yep.
So what was Blimp?
What was the quick take on what that service was?
So, we were kind of three people working on that project, and we got together one day, and we were just thinking of, like, things we could possibly build a product around.
We had all worked in kind of digital agencies and marketing firms and things like that, where mostly they were using Basecamp, like, first version of Basecamp at that point.
So we had really strong opinions about what project management should look like and how it should work, especially in that creative environment with small teams.
So we proposed a structured kind of process for project management.
So we had a really nice niche. We were targeting small teams like ours, three, four, five people.
Most work was around the actual tasks. Basecamp had messaging, and people would just do project management via messaging, which is via email.
So we kind of forced you to use our proposed flow. And then at some point Trello came up, Asana came up, and they were way bigger than us. And eventually it made sense for us to just shut it down and focus on other things.
Right, okay. What was the run there? Because that was, you know, a five-year period, you say?
Yeah, at least five, six.
That's a good run, right, for a startup, you know, a small team.
Yeah, definitely.
Running your own show.
We didn't take money. It was completely bootstrapped. We got together and made it happen. We learned a lot along the way, through the whole period. And there were definitely really good opportunities for other things to pop up, one of them being this side project I still maintain called FilePreviews.
Yeah, I wanted to ask about that. You and I have spoken about it, but I think it's a really interesting project. So what's the quick take on it? And then we can dive into the tech stack, because that's interesting.
So at Blimp we had a file section where you could upload files, share them, and upload revisions of files. And once we designed the whole thing, we noticed that we were missing pretty thumbnails for files. If you upload a JPEG or a PNG, that's easy. If you upload a Word document, that's a little bit harder. If you upload a Photoshop file or an Illustrator file, it becomes harder. So we built this pipeline where you could upload basically any kind of file and we would just output some PNGs. And that was harder than it sounds.
So from the get-go we built that separate from Blimp, as a separate service. And after we built it and were using it, we noticed that we could just put a landing page on it, put a pay button on it, and other people might buy it.
And surprisingly, it outlived Blimp as a product. It's been going on, and I took it over after we decided to shut down the product.
That's great, because that's like the original microservice, right?
Yeah, right.
Five years before the book, five years before the meme. It's like, yeah, it's a microservice. And so what's the tech stack, or how much can you tell us about the architecture of it?
So yeah, it's built with Django, obviously. It does have a simple API, so that's Django REST Framework. We use Celery pretty extensively. We're using Redis as our broker and Postgres as our database. We host our workers on pretty hefty EC2 instances, and our API is actually still on Heroku.
So yeah, it gives me an opportunity. I haven't worked with Python in my day-to-day job for a while now.
And it gives me an opportunity to actually stay in tune with what's going on with Python and Django.
But it also gives me a chance to kind of polish my scaling skills.
So we have, you know, this is pretty CPU intensive work and we have pretty interesting traffic patterns.
So, you know, I have to think about how we scale this, how we improve our availability, how, you know, we make sure we have monitoring and observability up to par.
Yeah, because I can imagine that you could and maybe do get a big client just comes in and crushes you with.
I mean, I guess so how does the traffic look?
Is that common that out of the blue you'll just get nailed with a lot of stuff and then otherwise it's flat or is there any consistency there?
So, yeah, so we, you know, FilePreviews has customers around the world and different kind of traffic patterns.
So we have people doing like bulk imports of just files that they have.
So sometimes we'll get, you know, 5,000 requests one after the other to generate previews for different kinds of files.
And, you know, I guess part of the challenge is, you know, you'd handle that with scaling your instances.
So you have more instances and you can kind of, you know, process the queue faster.
But things like larger files.
So, you know, we have customers that sometimes upload a PDF file that is like 20,000 pixels by 30,000 pixels, right?
And it's like a giant PDF. So one of those could hold up the queue for a longer time. So we need to be observing our limits: time limits, memory limits, CPU limits, file size limits.
Would this be a good example of a use case for serverless, like a Lambda or a functions-based approach? Because that kind of looks like the perfect example, right? You've got discrete jobs that can be done one by one. Would it work in that environment, or...?
So maybe. So yes, I'm hoping to be able to try that. AWS Lambda has some limits on the size of what you can actually install there. So even though our workers run in a Docker container, you could imagine that Docker container is just like a VM. So it'll be interesting to split that up to be able to fit into whatever the constraint for Lambda is. But it should definitely be possible.
At this point, you know, this is kind of my side project and I haven't had the time.
I tried porting a specific part of the pipeline, and it just was so much additional complexity, like with testing, and with how we would compile different dependencies that would fit into that particular environment.
But yeah, it definitely seems like the thing that you could probably model in that serverless-like architecture.
Okay, interesting.
But it's not, you know, this two plus one, yes, but the other no, is that hang on, you
know, the VM doesn't give me a lot here.
I've got this whole structure.
Yeah, I spent a lot of time tuning to a particular infrastructure model, and that has solved our immediate scaling issues, so I am not touching that much of it now.
Can I ask, currently you're working at Auth0, is that right?
Yes, I am. I started working at Auth0 as an engineer in August.
Okay, so in an earlier episode we were talking about authentication and I sort of said, you know, I gave Auth0 a go and it didn't really fit in with Django.
And so, you know, I probably wouldn't use it.
And I don't know, what's the use case?
So here, can I ask you to put your Auth0 hat on
and tell us the Auth0 story for Django?
Yeah, you're wearing the sweatshirt.
So if I'm going to use Auth0 in my Django project,
or I'm going to consider it,
can you just give us the sales pitch?
What does it give me as a Django developer?
Okay, so going back to that episode where you were talking about authentication and authorization, you defined those, and I think you did a very good job.
And I guess one question I, as a developer with small side projects using Django, often ask myself is: why would I use Auth0 versus something like django-allauth, right?
And if we think about Django's batteries-included philosophy, Django does really great work at not letting developers screw up authentication, right? There's a configurable password hashing system with really great defaults, so you don't have to think about, do I use bcrypt or do I use MD5? It's really hard to even store plain-text passwords.
And then there's this whole other set of things that third-party packages help with, like password strength checks, throttling, and authenticating against third-party identity providers like Facebook or Google. But you, the developer, still need to think about those security-related things. You need to know that you're looking for throttling of your login endpoints, and you need to install a package after you find it, you need to configure it, you need to maintain it and whatever other dependencies it might have. You might need to have a Redis server or something else, right?
So Auth0 provides authentication and authorization as a service. We give developers and companies building blocks to secure their applications without having to become security experts. We provide a unified, device-agnostic login experience, so it doesn't matter if you're working on a web app or a mobile app, you can still provide this unified login experience that looks like one single thing, right? We provide many social identity providers, and you can add custom ones too. Brute-force protection, breached password detection, single sign-on, passwordless login, multi-factor auth, user management.
And these are all things where, if something like Django or django-allauth can't provide them, you can go and possibly find a third-party package that somebody built. You then need to actually assess if they're well maintained, if they're well built, if they're secure. And then we go back to that security expert thing.
Auth0 also does something really cool, which is that they allow you to write custom JavaScript code to customize any part of the authentication and authorization flow. So, for example, if you wanted to enrich user profiles with data from Clearbit, if you wanted to reuse information from another database or service, if you wanted to, I don't know, notify another system of logins in real time, you can do that.
So, yeah.
So, it goes back to
Django provides you
login out of the box, and you could figure out the whole template situation and so on yourself. And going back to one of the recent episodes, we were talking about patterns. After you've built one or two apps that have login, you'll notice the patterns, and you'll maybe have your own project templates. But then there are these thousands of other little things around authentication, authorization, and security that you either need to be aware of and be proactive about, or you'll have to be reactive after a possible incident. So either you delegate all those really important security and trust aspects of your application to somebody like Auth0, or you take that on, and you need to be aware of what taking that on means.
Okay, so that's a super answer. Two thoughts came up. One was, you mentioned at the beginning the side project versus the big project. For a side project, yeah, totally, I can see why you'd take it on. But Auth0 surely pitches itself for big projects too, right?
Oh, yeah.
It's not just for a developer's side project.
Yeah, so it's not only for somebody building a web app and a mobile app. We are an identity platform, right? So if you're building single sign-on for your whole internal infrastructure and services in a large enterprise setting, Auth0 would still work for you fine.
Okay. And then the second thought was, when I tried this a long time ago, when Auth0 first came out, I gave it a go, and I didn't really find that it integrated with Django in the way I wanted it to. It would do the JWT client-side authentication, and I could see how that would give me the mobile app authentication. That was all super. But what I really wanted was, in my view, request.user to return a user instance. So is there now a nice story about integrating with a Django project, with, like, REMOTE_USER, the remote user backend?
So, one of the
things that attracted me to Auth0 a long time ago
was their content as a marketing effort.
And, you know, they have really great guides.
So there's, I was just checking out
the getting started guide with Django.
And since Auth0 actually provides an OAuth,
sorry, an OAuth 2 API,
you can use something like Django Social Auth or
django-allauth, and you would use Auth0
as an identity provider, right? And then
in Auth0, you would handle Facebook login.
So part of the
quickstart guide I was looking at does use the remote user backend
and the remote user middleware.
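For readers who have not met the two Django pieces named here, the settings fragment below shows how Django's own documentation wires up REMOTE_USER authentication. It is a sketch of the stock Django configuration only, not a copy of the Auth0 guide; how the authenticated username actually reaches Django in an Auth0 setup is not described in this conversation and is left out.

```python
# settings.py (fragment): stock Django REMOTE_USER wiring.
# RemoteUserMiddleware must be listed after AuthenticationMiddleware.
MIDDLEWARE = [
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.contrib.auth.middleware.AuthenticationMiddleware",
    "django.contrib.auth.middleware.RemoteUserMiddleware",
]

# RemoteUserBackend trusts the externally supplied username and, by default,
# creates a matching Django user the first time it sees one.
AUTHENTICATION_BACKENDS = [
    "django.contrib.auth.backends.RemoteUserBackend",
]
```

With this in place, `request.user` in a view is a regular Django user instance, which is the integration asked about in the question.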
You know, it really fits into the whole Django lingo.
Okay, that sounds super. So I think, in the show notes, we can put a link to those getting started guides with Django.
Yeah. Well, and I wanted to ask you too. I saw on Twitter, I think just yesterday, there was the North Bay Python conference, and Guido tweeted this thing about Jacob Kaplan-Moss, Django co-creator, saying: stop using passwords, use login with Google or Facebook. Quote, Google security team is better than yours.
Honestly, my jaw dropped when I saw that.
But I'm curious what your thoughts are, as much more of an expert in this realm than I am.
I just don't even know where they're coming from with that. I mean, Facebook was storing stuff in plain text. I can see Auth0, but to just hand it over to one of the couple of monopolies, to me, I just, I don't know.
There's two parts. I feel like I'm missing something big here.
If this is just a side weekend project, then absolutely, the more you can outsource the better. But if your user database is a core strategic asset of your business, then you can't just hand it over to a third party. So there's that point, which is important.
And yeah, Facebook... every time I turn on the news it's so crufty and creepy. I mean, Google's not much better.
I mean, so with django-allauth, which is fantastic, I think I still have the top Google spot for a tutorial on that. And I deliberately don't, I think maybe I showed Gmail, but I don't show Facebook, because, well, here I go, never working at Facebook, I think Facebook is pretty awful, so I don't feel like doing their job for them.
But we got onto a little bit of a rant. So make the case for Auth0 as opposed to Google or Facebook. If someone just says, well, I'm just delegating my auth in either case, why is that actually incorrect?
That's interesting. So I think one thing to notice is that you can use allauth, I mean, sorry, Auth0, without integrating any social identity providers, right? You don't need to use Facebook or Google or the other 300 providers that we might have. You can use email and password against the Auth0 database, right?
Yeah, trust is the name of the game, especially when you're handling such an important path for an application as authentication is, and as authorization is.
Yeah, I know. I mean, it's a deep topic. I mean,
one part of this, which I've seen with some services recently, is this login by email where
there is no password. Every time I log in, I just put in my email address and I get a
one-time link. You know, I can see how that's more secure. It's maybe a little annoying,
but I'm seeing that more and more.
But as well, you've got to bear in mind that a lot of people struggle with passwords. And if you can send them a link where they get a button and they can click it, or, you know, tap it these days, that's actually a much better user flow.
So it's arguably more secure, probably is, assuming the email's not compromised. But the email's the sort of golden key to all of this, isn't it?
Yeah, so that's passwordless login. And that's something that Auth0 allows you to do pretty easily. And since we also provide MFA, multi-factor authentication, you can use Google Authenticator or Authy or, if you have to, you can use SMS.
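The one-time email link described above can be sketched with a signed, expiring token. This is a minimal illustration using only the Python standard library, not how Auth0 implements it: `SECRET_KEY`, `MAX_AGE_SECONDS`, and both function names are hypothetical, and the sketch only enforces expiry, so making the link truly single-use would also need server-side state that records which tokens were already redeemed.

```python
import base64
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; load from config
MAX_AGE_SECONDS = 15 * 60  # assumed lifetime of an emailed link


def make_login_token(email, now=None):
    """Return a token binding the email address to its issue time."""
    issued_at = int(time.time() if now is None else now)
    payload = f"{email}|{issued_at}".encode("utf-8")
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode("ascii") + "." + signature


def verify_login_token(token, now=None):
    """Return the email if the token is authentic and unexpired, else None."""
    try:
        payload_b64, signature = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64.encode("ascii"))
        expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected.encode("ascii"), signature.encode("ascii")):
            return None
        email, _, issued_at = payload.decode("utf-8").rpartition("|")
        age = (time.time() if now is None else now) - int(issued_at)
    except ValueError:
        # Covers malformed tokens: bad split, bad base64, bad encoding, bad int.
        return None
    return email if 0 <= age <= MAX_AGE_SECONDS else None
```

The token would be placed in the link sent by email; whoever can open that mailbox can log in, which is why the email account is the "golden key" mentioned above.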
And I think now that we're starting to see 2FA become like something normal,
I wouldn't say it's like, I can't get my parents to use 2FA.
And so then you can't make that a hard requirement, right?
So I can see passwordless login, depending on your target users,
from a UX standpoint, might actually hurt your user experience.
I can imagine my mom putting in her email, if she remembers which email to use.
And then, why do I have to go to my email?
And I found the email.
Now, what?
Oh, there's a link.
So again, it depends on your target.
I can see now that there are a lot of players working to improve the whole user story for security, if you think about security keys, like hardware keys, passwordless login, MFA, and things like that. So I'm hoping that in the coming years we'll have a better story for all of this.
I think Django needs a better story as well. We don't have two-factor auth built in yet, and we haven't yet even got the hooks where you would add that, really. There are a couple of third-party solutions, but they need to be brought into Django, I think. It's part of the batteries now, that you need to be able to plug into two-factor.
Yeah, I recently... so I love the Django admin and I never turn it off. For FilePreviews I've had to work around things like pagination, because at our rates most things don't work in the Django admin. But once you get over that hurdle, it's like, okay, so now how do I hide the admin login? Because I already have this whole other login flow that I want to use, and funnel all traffic through, whether it's for our regular users or for admin support users. And yeah, it's been a little bit clunky on that side.
The way I do that...
Which I don't quite understand. I was just going to ask, are you giving admin access to some users so they can see their previews in there? It sounds like there's three levels.
No, it's just me at this point.
Okay. So what I do in that situation is I use nginx to lock it down to localhost, and then just SSH into the box and connect via a tunnel.
Yeah, and then if you're using Heroku, that becomes harder.
Oh yeah, okay, okay. But the reason I do that is because then it's not exposed on a URL.
Yep. And most logins out there don't even have rate limiting or throttling, so you can do credential stuffing attacks and you probably wouldn't even notice.
Yeah, yeah. Well, I wonder, for Auth0, from the marketing aspect, the more you learn about the web, the more you go, oh my god, how scary it is. But a beginner just wants it to work and has trouble with that. So I wonder, at what point does someone go, wow, I really need to switch over to Auth0? Is it that they already got burned, or they're at a big company where the login is complicated? It feels like you almost need to get burned, or be in a large enough, complicated enough organization, before you go, oh wow, maybe I want some help with this.
Yep, I think
right now it's, it's kind of, you know, reactive, you've had a bad experience, you've had a leak,
you've had, you know, security vulnerabilities being reported to you. Um, and I think one of
the other like great things that Auth0 does well is again, this is probably a marketing strategy,
is putting out content that is relevant on security topics and best practices
and kind of like teaching the general audience about security
and hoping that we're going to start seeing people be more proactive about this
and, you know, not having to wait
to actually be in a bad position after an incident.
And then, you know, having to figure out
maybe we should have used an identity provider
and, you know, three years in with millions of users,
that might be a harder thing, but still possible.
Right. No, I think they have the fantastic guides on,
I remember just JWTs when I was learning about that.
And I mean, it's similar to what DigitalOcean has done,
I think, to differentiate themselves.
They have fantastic content because, you know, spinning stuff up is scary for beginners.
Though it does seem to me just from a business aspect that at the end of the day, you know, beginners are kind of whatever.
It's all about big clients.
I mean, this is what I have to keep reminding myself because I'm, I mean, even Heroku, like I have Heroku in my books.
And I mean, almost every day I get some sort of question around Heroku or SendGrid, you know, Twilio.
And I'm just like, why can't they be bothered to make this, you know, friendly?
And it's because they don't, I think it's because they don't really care.
Right. I mean, a couple newcomers isn't the same as having really good, you know, docs as opposed to tutorials that an experienced developer at a large company can just dive in and use.
Yeah, that's the thesis I have now. I don't know.
I can't speak to, you know, Auth0 or Twilio or Genelec and stuff like that.
But having worked on FilePreviews, after some time I decided to cater to a specific target of users.
mostly because I don't have the bandwidth to deal with a lot of incoming beginner-level leads, right? At the end of the day, unfortunately, they'll require a little bit more time, and, not hand-holding, but, you know, explaining what a webhook is and things like that. And in my experience, for all that time and resources invested into converting that specific user to a paying customer, the ROI is way less.
So I'm kind of vague in some parts. I'm kind of vague in some of the pricing, vague in some of the examples that we put out. And for me, that has worked. So I can't definitely speak to...
I think that's a standard story for SaaS businesses, right? It's the few big enterprise clients, the ones where you're off the end of the pricing thing and it's call us for a quote, that's where all the profit comes from.
Yeah, yeah. Well, I mean, I'm slow coming to this realization, but it sort of makes sense, because I've always wondered.
Well, so what is the stack at Auth0? Because you mentioned you're not using Python there. What sort of technologies are you using day to day?
So it's mostly Node.js. From the ground up, the main core services have been built in Node.js, and we have tons of people that are very experienced in how to make Node scale. There are some services that are built with Java and Go. There's some Python, I think, but it's for some ops-related things.
Yeah, just switch over to async, do a total rewrite, no big deal.
Yeah, I'm currently... so when I joined, I joined the platform domain. So we are the backbone of trust for our
customers, but we kind of enable other developers in Auth0
to be able to be as
productive as they can be. So I'm working
on our untrusted code execution platform.
So basically, for all this code
that our customers can write in JavaScript to extend
different flows of authentication and authorization,
we execute their code
and inject the results into different parts of the flow.
You know, that's a great problem, though,
because, hey, give me some code.
I'm not going to know what it is,
and I'm going to run it on my computer.
Like, that's a tricky problem, no?
Yeah, doing that at scale seems hard,
but I'm really excited.
I've been here for, I don't know, a couple of months now, and I've learned so much from so many smart people. And coming from small consulting shops and small products, the experience I've gained doing things at this scale is amazing.
And that's kind of why I wanted to switch after all these years, you know.
So you need to enter up in your thing.
Yeah, that's great.
Well, and this may be an ignorant question,
but so it's Node, so are they using,
you know, for the web stuff, is it Express?
I mean, I know the web part is just a teeny piece of it,
but what is, beyond Node, what is the stack?
I mean, I know it's hard to define in such a big organization.
There's some Express and there's some hapi, hapi.js.
And hapi is a kind of batteries-included web framework,
or API framework, right?
It provides things like Joi for the validation
and the back end, the authentication back end.
Yeah.
You don't have to go looking for them in micropackages.
It's definitely a little more than Express,
way less than Django.
Yeah, and that's...
But I think that there's work on standardization
around, you know, which one should we use when.
That's all I know.
Well, I know that the Brave browser, which I actually use, which Brendan Eich is involved with, yeah, they're big on the hapi train.
Yep, and that's a pretty good stamp of approval.
Auth0 also actually sponsors a lot of open source projects around authentication and identity. So before I even joined Auth0, they reached out to me one day and asked me if I wanted them to sponsor the PyJWT library. And that's amazing. And I learned that they do that with many other libraries.
So, you know, we use these libraries internally. And even if we don't use them, the fact that there are other developers working on this, they care about that.
Yeah, well, I like that they're doing the positive education approach as opposed
to the fear approach.
Because fear, you can't fear something you don't understand. Because certainly, as a parent, I look at all the ads, and they're all like, your kid might die, or you're a bad parent. They're pretty unsubtle about it, for a car seat or a car. So I appreciate them having that approach.
I did want to ask, we're coming up on time. I think you just gave a talk at PyGotham, right? Is that out yet? Can we link to that?
Yes, I can definitely get that into the show notes.
So yeah, can you talk about that?
So yeah, it's called Python, Government, and Contracts. It's a talk I gave, I think in September, at PyGotham, and actually my partner in crime gave a version of that talk at North Bay Python.
Oh, okay.
That recording is also out.
So, after Hurricane María passed through Puerto Rico, I was
not in Puerto Rico at that time.
And I spent weeks, if not months, without hearing from my parents,
due to infrastructure issues. Power was out, cell phone antennas were out,
water was out.
And that led to some questionable contracts that were awarded to companies to supposedly rebuild infrastructure and help after the destruction. FEMA was involved in some of that. And that led to me learning about the Office of the Comptroller in Puerto Rico. They publish the contracts that government agencies award to contractors. And some time after that I had some questions as a civilian. I am not an investigative journalist or anything like that, but I did have some questions about how money was being spent in particular agencies, and I wanted to see relations between different contractors doing business as other companies. And so I basically put a project together
where we kind of took the data,
we kind of pull and downloaded all this data
from the office of the controller's website.
We downloaded the actual PDF documents for the contracts
and we're kind of extracting text
from kind of things that I've learned doing fault reviews.
We're extracting that text, we're indexing that data.
So now it can be searched and correlated.
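As an editorial aside, the "extract, index, search" flow described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: the text is assumed to have already been extracted from the PDFs, and the names `tokenize`, `build_index`, `search`, and the sample contract IDs are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Build an inverted index mapping each token to the IDs of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


# Hypothetical sample data standing in for text extracted from contract PDFs.
contracts = {
    "c-001": "Contract for power grid repair services",
    "c-002": "Contract for water infrastructure repair",
    "c-003": "Consulting services agreement",
}
index = build_index(contracts)
matches = search(index, "power repair")
```

A real deployment would more likely lean on a database or search engine for this; the sketch only shows why indexing the extracted text makes the contracts searchable at all.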
So this is kind of like a, you know, civic hacking project that I'm very passionate about.
I've been working on it on and off for almost a year.
And finally, this year, we decided to talk about it because we thought it would be an interesting story about, you know, how we went about it.
This is open source. It's built with Django, Django REST Framework, and Postgres, and our front end is Next.js.
And, yeah, so I'm definitely very excited to keep talking about this.
We're at a point where we're, you know, talking about how we can be more transparent about, you know, how this is financially sustainable.
It does cost some money to run, not that much.
So, you know, we want to be transparent about that and see if we need help from, you know, others.
And I'm finally excited to see where this goes. You know, I want to empower other people like me to know what questions to ask, to be able to visualize, and at the end of the day just hold accountable whoever needs to be held accountable.
These kinds of things, you always see this: billions of dollars are being budgeted for rebuilding, and yet somehow, on the ground, the stuff wasn't rebuilt. Well, where's that money gone? All it takes is for the light to be shone on where that money goes, and it will be spent in the right places, whereas when there's darkness it goes into people's pockets. So that's super.
I mean, the reason we talk about Hurricane Maria and how that leads up to this, even two-plus years after that happening, is that recently the FBI arrested a FEMA employee and the CEO of one of the companies involved in one of these contracts. So this is still relevant today, and there are ongoing investigations into multiple... you know, some politician has a daughter working for ten thousand dollars a month in some unrelated thing in the government, and now we're starting to see these things popping up. And it's like, yes, we're fighting against corruption.
Super. Well, do you know if any of these federal agencies are using your work to help in their investigations
or do they have their own tools?
So, I don't think so. At the end of the day, the actual contracts are available; you have the Office of the Comptroller. I do know that there are some journalists using this tool instead of the Comptroller's tool, just because it sucks less, and you get some insights. Even at the simplest level, you get some insights into spending over time, and into correlations: say a contractor was entered into the system with a typo, you can still search through that and see, oh, these two contractors are the same entity, and they've been awarded this amount of money. And you can see the actual PDF there without having to download it.
So yeah, in the end, I wouldn't have wanted to build something like this. I would have wanted the tool put out by the Office of the Comptroller to be this.
But I'm happy to have done it and keep like...
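As an editorial aside, the idea of spotting that two contractor records with a typo are the same entity can be sketched with approximate string matching. This is a minimal illustration using Python's standard `difflib`, not a description of how the project actually does it; the helper names, the 0.85 threshold, and the sample contractor names are all assumptions.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase the name, replace commas and periods with spaces, collapse whitespace."""
    return " ".join(name.lower().replace(",", " ").replace(".", " ").split())


def similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0 for two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def group_contractors(names, threshold=0.85):
    """Greedily group names whose similarity to a group's first member meets the threshold."""
    groups = []
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([name])
    return groups


# Hypothetical contractor names; the second is the first with a typo.
names = ["Acme Builders Inc.", "Acme Bilders Inc", "Island Power LLC"]
groups = group_contractors(names)
```

The threshold is a judgment call: set it too low and distinct companies get merged, too high and typos slip through, so any grouping like this would still need a human check.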
I spent a lot of time
figuring out how to gather this data.
It's definitely not open data.
It's on their
site somehow and I managed to
just download it very slowly
and make it happen.
So, like, if now I can, like, empower, I don't know, some data scientists to build cool visualizations on something else that I don't know, like, I don't know, spending by a particular geographic location by fiscal year or something like that, they can now do that without having to spend all this time and effort into actually gathering the data.
Super.
No, that's fantastic. I mean, to make this about me, I spent two years of my life on a project with school data in the United States where every state has all this reporting data on these various tests that everyone has to take at third, eighth grade, high school, all these types of things.
And I probably just got a teeny glimpse of what you had, where it's just a nightmare getting this information. I mean, first of all, you know, some states had it online. You could just download the CSV files, you know, like Massachusetts, California. Some states, you know, it was like a PDF. I think it was, I think it was Kansas who said I had to file a Freedom of Information Act request to get a CD with the data.
You know, and it's just... so I guess it's a combination of nobody having the ability and impetus to actually pull all these things together, and you realize how futile all these government agencies are, where they're just checking their box. It doesn't even occur to anyone to use it for these broader purposes, like, well, what if we wanted to analyze the data? What if we wanted to use it for something? They spend millions and tens of millions collecting it, and then it just goes away.
More deeply sinister than that is that there are institutional biases against making the data openly available, because people in positions of power have a vested interest in keeping it quiet.
Sure.
So it's important that people like Jose do the civil society stuff to bring it out into the light and call them to account. That's how democracy is maintained. That's how... I'm welling up.
Yeah. Well, and that's one thing I love about Django: I think it's unusual, its roots in journalism and investigative journalism from the beginning, right? I mean, Jacob worked at 18F; Simon's on a fellowship at Stanford on this. I kind of love that non-pure-tech background of Django. Whereas, as you said, you wanted an easy way to display this data, and the Django part didn't stand in your way; it was the collection and all the rest.
Yeah, I mean, I keep going back to Django because I hate reinventing the wheel. I just want to get stuff done. And so, you know, I have a project template that has some of my preferences baked into it, and I feel so comfortable.
Yeah, no, I use that. It's one of my favorites. I'm always following the updates.
Wait, you don't use mine, Carlton?
No, yours is fine, but I use Jose's.
Okay, sorry. Competing over free tools, yeah.
Yeah, so, you know, it goes back to not having to reinvent the wheel all the time. I just want to make things and do it fast and get out the door as fast as possible, and Django and Python are the things to do that.
As we wrap up, are there any plugs you want to make? You have a personal site, jpadilla.com, which we'll link to. Anything around side projects or Auth0 you want to mention as we head out the door?
So, I mean, Auth0 is always hiring, so I'll plug that link in the show notes. I'll also link to that PyGotham talk; it's something I'm really passionate about. I also set up my sponsorship profile on GitHub, so I could maybe link to that. I'm hoping to get back on track with some of the projects that I've been working on, that I know I've, over time, let sit and go unmaintained. And I feel really bad about that, because people actually use them. So I'm hoping to start spending some more of my free time getting them back up to date, especially dropping Python 2 support.
Right, but let me just say, I just have to say that you mustn't feel guilty, because the time that you give to open source is phenomenal, and it has been phenomenal
over a massively
sustained period of time
you've got a life
you've got family
you've got work
you've got
other commitments
the guilt
is not necessary
your work
stands on its own
thank you
right
well again
I'm so pleased
we could have you on
my pleasure
great thrill for me
great thrill for me
to meet you in person
you know a year ago
so yeah
we'll link to everything
thanks again
thank you for having me
thank you for coming on
ciao
ciao