Transcript: Security

Hi, and welcome to another episode of Django Chats. I'm Carlton Gibson, joined as ever

by Will Vincent. Hello, Will.

Hi, Carlton.

How are you?

Good. I'm good. This week, we're going to talk about security, which I'm kind of excited

about because it is a deep and important topic. And I think it's interesting when I'm teaching

beginners and I sort of say and then show them the web is a really, really dangerous

place. And it's why a framework like Django that takes a lot of effort and a lot of precautions,

it really helps you out. So I think we're going to talk about what are these classic dangers,

and then what does Django do to help you with them, and then how do you configure it?

Yep.

So let's start off. So why is the web dangerous? The web is dangerous because

there's lots of bad actors, but really bad bots out there trying to hack your site for whatever

reason. We've talked about this in other episodes. If you have a slash admin page,

you know, as a default you should change that, because you or I could create a little bot that goes around to every website out there, looks for a slash admin, and tries to...

It's much, much worse than that, because it's not that we create that bot. There are kits downloadable off the internet that have scripts to check every known web vulnerability in every known framework, and to send that to every addressable IP address that your computer can reach. You just have to set up your thing and fire off this kit, and it will go and find every server out there and try every vulnerability against that server.

Right. These were called, and probably still are, script kiddies: back in the day you could download and run something you don't know how it works, but you can still cause a lot of damage.

So this is why it's important to keep updated. People think, oh, you know, it's not going to affect me, I've got a small site. No. If you have a weakness in your application, it's not if your application will get hijacked, it's when.

Right. I just saw a headline: Athena, which is JP Morgan's trading platform that does billions and billions of dollars of trading, they've publicly announced they're not going to make it over to Python 3 by January 1st. So that's a big invitation to hackers just on the Python side, not to mention Django's updates coming up. You want to stay up to date.

Okay, but that's slightly different, because they'll pay people to maintain it, to backport security fixes to 2.7 for as long as they have to, right?

Yes, but still. Yeah, that's perhaps an unfair example.

For them it's just going to be disproportionately expensive, because they'll have to pay super consultant rates for everything to make sure that they get past compliance, whereas everybody else who can't afford that budget is just going to be stuffed.

Right. I would say, before we get into the technical

things. So the biggest risk is what's called social engineering, which is people. So that's

some individual who has access to things is compromised, either they're disgruntled,

but more likely they click on a phishing link. So someone sends them an email that looks like

a real email, they click on a link. I mean, this is what happened in the US in our elections,

some people in Hillary Clinton's staff clicked on phishing email links, click on one link,

and it goes to a site that looks like a site, but it's not. We'll get into how that kind of works.

And then that person, their level of permissions are now owned by Russia or whomever.

So the biggest risk is always people.

So you want to make sure that permissions are set appropriately.

You want to have a lot of permission levels in your application.

Two-factor authentication is also probably a good idea.

I know there's some talk about eventually rolling that into Django itself.

But permissions, I would say, is the biggest one.

Like, for example, do non-engineers need write access?

No.

Yeah, I mean, there's this principle of least privilege, right? It might sometimes be a pain, or wouldn't it be nice if I could just do this thing. Well, do you need to do that? Is it not better, just for the sake of security, that you can't do that and you have to go through this other person who can?

Yeah, and it's a tough job to be the chief security officer, or whatever the title is, especially at larger companies. In some ways you're the whipping post, because it seems like it's expensive up front to do all these things until something bad happens. And if you don't do these things, bad things will happen.

The number two risk, I would say, and I'm curious what you think, Carlton, after people, so social engineering, would be updates. Just keep up to date. Keep up to date with Django, where most of the minor releases are security related, and especially with third-party packages. For example, django-crispy-forms just went up to 1.8.0, but from 1.7.1 to 1.7.2 there was a security issue, right? I know Adam Johnson was mentioning I should update my books to fix that.

Yes. The key point is the minor fixes: that's where you'll find, oh, there was this little injection vulnerability, let's patch that.

Right. The minor fixes are almost always a security thing. I guess, what else would they be? Maybe some egregious bug fix.

Well, it depends on the policy, right? Django itself: the current major version, so currently 2.2, will receive bug fixes, but bug fixes for new features. So if it was a new feature introduced in 2.2, then you'll get a bug fix in 2.2.4, say. But beyond that, the extended release, the 1.11.25, that's the LTS. For that whole year and a half, three years, or whatever it is, of security releases, they are security releases.

They would also get data loss bugs, but those are so few and far between as to almost never happen.

So the majority of patches there are security fixes

and most of them are public in that, you know,

there's a database, a CVE database of security exploits

and you publish those so that people know about them

and know what the mitigation is and, you know,

people who are interested in security subscribe to those mailing lists

and they know about them so they know what to update.

It's really important that you keep up to date.

And as I said, as we were talking about at the beginning,

it's because for every vulnerability,

someone will write the, and it's not someone evil, it'd be a penetration tester, a pen tester,

right? Someone whose job it is, is to find out whether an application is secure so that they

can secure it. But they write the kit, but the kit can be used for good or for evil. You know,

it can be used for penetration testing, which is a legitimate use, or it can be used for

penetration, which is obviously not so good. Less legitimate. And we've talked in another episode about how to update Django. There are all these great deprecation warnings, and what you want to do is go through and run, oh, I'm forgetting off the top of my head, what is it? It's with the W flag.

Yeah, so python and then dash capital W for warnings, and if you put -Wall, which reads like "wall" but is "warnings all", then you see all these deprecation warnings, or pending deprecation warnings. So there are levels. A pending deprecation warning means it's not quite deprecated, but it will be soon, and then the deprecation warning means, hey, you really had better fix this now because it's going to disappear very shortly. But by default pending deprecation warnings are silent, so without this all flag they don't show up. You want to run that at least from time to time, or perhaps in your CI as a different build, so you can see these deprecation warnings, and then you've got ages and ages to fix them. If there's a deprecation introduced in Django 3.0, it won't be removed until Django 4.0, so you've got three whole versions, or however many, to fix it.

Yeah, like 3.0, 3.1, 3.2, and then 4.0. Is that the agreed timeline? It only goes to 3.2?

Yeah. The way it works now is the .2 version will be the LTS, and then the next major version will follow that. So 3.2 will be followed by 4.0.

Okay, good. So what else?

Well, I'll just slightly plug myself: if you're curious how to actually step through all this, I have a couple of chapters in my book Django for Professionals on making something production ready, because the information is out there but it may not be in the most concise form.

So what's another step? Environment variables. This is another must-have for security. Basically, you don't want to put secrets in source control. You don't want, for example, your secret key in your settings file, or API keys. You don't want that floating around so that anyone who has access to your code base, any employee of any level, can see them. So you create an environment variable, which basically lives separately. How would you describe it, Carlton?

Okay, so in the Unix environment every process which is launched gets an environment, a dictionary basically, mapping keys to strings, which the launching process can set, and it will be inherited from the launching process. Hence, when you're in Bash, you might have put export DJANGO_SETTINGS_MODULE. The export says: pass this down to the child processes.

Yeah, so you can store those environment variables in a whole bunch of places, but you can access them in Python with the os module: there's os.getenv, or the os.environ dictionary, which you can access directly and set, and you can use those to inject these environment variables into your settings file. When you first run django-admin startproject, you get a settings file and it's got a generated secret key for you. That's fine for when you start, but before you deploy you want to change that to use an environment variable. Then you can keep that in a .env file, which people use a lot, and there are various ways of launching with those environment variables in your process environment.

Right, and you probably want to generate a new secret key, because chances are you've done a couple of commits. There's a neat command, I'll post it, where you can use Django's generator, but basically it's a 50-character random string that's generated.

It's really important. If someone has the secret key, that's session keys, all the encryption, everything. It all uses this as a kind of seed, and if that seed is known then all of that can be broken, because the algorithms are known. It's good that they're known, because then they can be audited, but the seed has to be kept secret, otherwise they don't work.

Right, and again, any commit ever, just one commit out of a thousand that happens to have the secret key or some other secret in it, is available to anyone who hacks into your GitHub, GitLab, whatever, anyone who has access to the code base. So change it.

Yeah. And these things are cheap as well. Like UUIDs, or secret keys, you can just generate a new one. There's not much cost to doing it.

So what else? The main ones, and we've talked about this at length in a deployment episode: turn DEBUG off. You want to set your ALLOWED_HOSTS so you can't just let any old person come in.
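To make the last few points concrete, here is a minimal sketch of reading the secret key, DEBUG, and ALLOWED_HOSTS from the environment rather than hard-coding them. The variable names DJANGO_SECRET_KEY, DJANGO_DEBUG, and DJANGO_ALLOWED_HOSTS and the helpers generate_secret_key and load_settings are assumptions made up for this example. Django ships its own key generator, so the one below only stands in for it using the standard library.

```python
import secrets
import string


def generate_secret_key(length=50):
    """Return a random string; 50 characters matches the length mentioned above."""
    alphabet = string.ascii_letters + string.digits + "!@#$%^&*(-_=+)"
    return "".join(secrets.choice(alphabet) for _ in range(length))


def load_settings(environ):
    """Build the security-relevant settings from an environment mapping."""
    return {
        # Fail loudly (KeyError) if the key is missing instead of using a default.
        "SECRET_KEY": environ["DJANGO_SECRET_KEY"],
        # DEBUG stays off unless it is explicitly switched on.
        "DEBUG": environ.get("DJANGO_DEBUG", "false").lower() == "true",
        # Comma-separated host names, e.g. "example.com,www.example.com".
        "ALLOWED_HOSTS": [h for h in environ.get("DJANGO_ALLOWED_HOSTS", "").split(",") if h],
    }


# In a real settings.py you would pass os.environ; a plain dict is used for the demo.
demo_env = {"DJANGO_SECRET_KEY": generate_secret_key(), "DJANGO_ALLOWED_HOSTS": "example.com"}
settings = load_settings(demo_env)
print(settings["DEBUG"], settings["ALLOWED_HOSTS"])
```

Defaulting DEBUG to off and refusing to start without a secret key means a forgotten variable produces an error rather than an insecure deployment.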

And there's a deployment checklist built into Django, and there are docs on it. If you run python manage.py check --deploy, it will say, hey, here are all the things you want to make sure are configured properly for production.

Yeah, set this header or that header, Strict-Transport-Security, all these kinds of things that you might do. And then for me, the thing that you need to do more than anything, the sort of general rule, is about handling user input.

Yeah, that's the big one.

The goal here is you must always filter input, and that means always pass it through a form with whatever validators are sensible, or a REST framework serializer, or something like that. So you filter all input, and then you escape all output. If you're going to take user-generated Markdown, render it to HTML, and put it on a page, you must run that through a sanitizer like Bleach, or you must escape it. Now obviously, if you're generating HTML, you have to run it through a sanitizer like Bleach, because if you escape it you'll just see HTML source in your HTML page.

What does it mean to escape something?

Okay, so Django gives you the escape utility, which will, for instance, take all opening angle brackets and convert them into the &lt; HTML entity. So if you had <p>, instead of it appearing as an HTML paragraph tag it would appear as the less-than entity, then p, then the greater-than entity, so that you'd see it as the HTML source on the other end, on the page. It's just really hard to describe that.

Yeah, I partly tossed it to you because it's a little hard to describe. I actually have an HTML escape tool that I built just with JavaScript on my website, wsvincent.com/html-escape-tool, so you can play around, and you can escape or, what is it, unescape code.

Yeah, I was going to say it's de-escape.

Anyway, it's got to be unescape.
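Since escaping is hard to describe out loud, here is a short sketch of it. It uses Python's standard-library html module as a stand-in, on the assumption that Django's own escape utility behaves similarly for these characters. The sample input string is made up for the example.

```python
import html

user_input = "<p>Hello</p><script>alert('hi')</script>"

# Replace the characters that are special in HTML with entities,
# so the browser displays the markup as text instead of interpreting it.
escaped = html.escape(user_input)
print(escaped)

# unescape reverses the conversion and gives back the original string.
restored = html.unescape(escaped)
print(restored == user_input)
```

The escaped string contains no raw angle brackets, so the script tag can no longer run. It is simply shown to the reader as text.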

All right, so yeah, user input is dangerous. You can't trust someone out there.

So, sorry, the escape function is in django.utils, a great folder, do go and rummage through there, in html. There's escape, there's conditional_escape, format_html, all sorts of good stuff in there.

Yeah, strip_tags, all sorts of extras. So let's see, what are the common attacks? I don't think we covered this in a deployment episode, but just very briefly, common web attacks you should be aware of. One of the top ones is SQL injection: someone has your form and tries to put some SQL in there, like delete database.

Well, did you hear about the school where they had little Johnny Tables?

What? No.

There's a joke, it's just a stupid programmer joke, about how some parents named their son something, something, semicolon, delete from users, or whatever it would be.

I know these exist, but it'd be fun to build a hackable Django site, maybe just in SQLite, and pull off the protections and let someone actually do this. The problem is you don't want to play around with this on real sites, and a lot of real sites are just shockingly susceptible to these things for one reason or another.

Okay, but fortunately it's remarkably hard these days to get SQL into your database directly. Say you're using Django: you have to get the connection, then you have to pull out the cursor, then you have to concatenate your string. That stuff's known, but it's not in any of the books. You have to dig down and work out how to use that. So you wouldn't be able to get user-concatenated SQL into your database if you're using the ORM unless you really tried.

Right, another reason to use a good framework like Django instead of doing it raw, among others. So what else?
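The difference between concatenating user input into SQL and passing it as a parameter can be sketched with Python's built-in sqlite3 module. This is only an illustration of the general idea, not of what the Django ORM does internally. The table, the rows, and the malicious string are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

malicious = "nobody' OR '1'='1"

# Unsafe: the input is pasted into the SQL text, so the quote characters
# change the meaning of the query and every row matches.
unsafe_sql = "SELECT name FROM users WHERE name = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is passed as a parameter and is only ever treated as a value.
safe_rows = conn.execute("SELECT name FROM users WHERE name = ?", (malicious,)).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated query returns every user, while the parameterized one returns nothing, because no user is literally named that string.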

XSS, cross-site scripting: someone injects a little bit of JavaScript into your page. This is why you want to automatically escape user input. CSRF is a big one, cross-site request forgery. That basically exploits the trust a site has in a user's browser. This is why, whenever you're doing a form that has a POST in Django, you want to put in the csrf_token tag and load the CSRF middleware, which puts in a random secret value, basically as a cookie.

So why does this work? What's a classic example? If you log into a banking site, the server sends back a session token. Again, this is because HTTP is stateless, so each request is kind of independent. But what happens if, in a separate tab, I have an email open, someone sends me a link, and it sends me to something that looks like my banking site but is not actually the banking site, and my browser already has my token loaded? Basically, that person on this fake banking site, because I'm logged in in another tab, can act like me in this new tab and do all sorts of nefarious things, like take out money, because I'm still logged in. So this is sort of the fundamental trade-off of the statelessness of the web: this security problem of how do you know that another open tab is actually the correct one.

Yeah, and so what the CSRF middleware does is inject this CSRF token in. By default it's required for all POST requests, which are the ones that are going to change something, and if the POST data doesn't include the correct CSRF token then the POST request will be rejected, even if all the other fields were valid.

But you still have to add the CSRF tag.
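The token check described above can be sketched in a few lines. This is a deliberately simplified illustration, not Django's actual implementation. The helper names issue_csrf_token and is_post_allowed and the field name csrf_token are hypothetical.

```python
import secrets


def issue_csrf_token():
    """Create an unguessable token to store for the user and embed in the form."""
    return secrets.token_urlsafe(32)


def is_post_allowed(stored_token, post_data):
    """Accept the POST only if the submitted token matches the stored one."""
    submitted = post_data.get("csrf_token", "")
    # compare_digest avoids leaking information through timing differences.
    return secrets.compare_digest(stored_token, submitted)


token = issue_csrf_token()
legit_post = {"amount": "10", "csrf_token": token}  # form rendered by the real site
forged_post = {"amount": "10"}                      # form submitted from another site
print(is_post_allowed(token, legit_post), is_post_allowed(token, forged_post))
```

A page on another site can make the browser send the session cookie, but it cannot read the token, so the forged request fails the comparison even though every other field is valid.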

Yes, right, so if you're rendering a form you have to add that.

Yeah. I wonder, almost never would you not want to have that there automatically.

Yeah. This used to be a big issue and it's almost not an issue at all anymore, because the modern web frameworks, not just Django but all of them, handle this in more or less the same way. As long as you follow the practice and you use CSRF protection, you really are protected. It's kind of nice.

What else? Are there some others?

Well, I can go through some. Clickjacking, that's another one, and there's a middleware to protect against it. That's when someone puts in a hidden iframe. An iframe is when you put another site in a site, for example if you have a Google Map or a YouTube video embedded on a site. But just because it looks like something doesn't mean that's what it is. Someone could show an image of a kitten and you think, oh, that's a cute kitten, but actually the click goes to Amazon and tries to make an order or something. So Django's clickjacking middleware basically determines whether a resource can be loaded within a frame or iframe on a page. Is that right?

Yeah, and you basically want to say no.

You can turn it off, but almost never do you want to.

Yeah. I can't think that I've wanted to actually embed my own site in an iframe more than a few times. If you're Disqus, where they've got that comment widget thing, then okay, you need people to be able to embed your site in an iframe. But if you're not... Is it Disqus?

Disqus, yeah, it's D-I-S-Q-U-S. Whatever, how do you pronounce it?

I don't know. It's comments, so I thought it was "discuss".

Built with Django, I believe.

Okay, well, there you are, go Disqus.

But, you know, if you're not them,

when have you ever wanted to put your site in an iframe?
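The "say no to framing" idea comes down to sending an X-Frame-Options response header. The sketch below shows it as a tiny WSGI wrapper written from scratch for illustration. It is not Django's clickjacking middleware, and deny_framing, hello_app, and fake_start_response are hypothetical names used only here.

```python
def deny_framing(app):
    """Wrap a WSGI app so every response carries an X-Frame-Options: DENY header."""
    def wrapped(environ, start_response):
        def start_with_header(status, headers, exc_info=None):
            # Drop any existing value, then tell browsers not to render this page in a frame.
            headers = [(k, v) for k, v in headers if k.lower() != "x-frame-options"]
            headers.append(("X-Frame-Options", "DENY"))
            return start_response(status, headers, exc_info)
        return app(environ, start_with_header)
    return wrapped


def hello_app(environ, start_response):
    # Hypothetical minimal application used only for the demonstration.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


captured = {}


def fake_start_response(status, headers, exc_info=None):
    # Stand-in for the server's callback so the headers can be inspected.
    captured["headers"] = dict(headers)


body = deny_framing(hello_app)({}, fake_start_response)
print(captured["headers"])
```

A browser that receives this header refuses to display the page inside a frame on another site, which is what defeats the hidden-iframe trick.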

Yeah, I mean, the only, you know,

I think the classic example is just like a static site

where you want to, I don't know,

or a WordPress site for a client where you want to put in the map

so people can find it, something like that.

But on a Django site, I mean, you can still do it, but that's the only example I can think of.

No, I mean, there are these little widgets where you want to embed something in some other site. Then, okay, you might use an iframe there, because they're still the most powerful way of being able to do stuff on your site.

Right, but the danger of a widget is you're trusting that widget and all their friends to be good actors, and they may not be. Or maybe they are, but then you're still using the widget when someone takes control of it.

So I've suggested this pattern a few times on the podcast, but if you're in this kind of situation then what you actually want to do is have two Django apps. You want your main Django app, which is sensible, has all your on-site functionality, and doesn't let anybody embed it in an iframe. And then, within the same project, you want another wsgi.py file somewhere else that serves an optimized and minimized Django app which just does the iframe-necessary bit and doesn't have access to all the other stuff, because you don't need to have all of it in one application. The same goes for async stuff. You might need only a couple of endpoints which need the new async stuff, so just serve those with a different worker, a different server, and have most of it going through the tried and tested WSGI pipeline, and then just have your WebSockets coming into a different worker, a different server endpoint, a different process.

Yeah, that seems like good advice to me. So what's the takeaway? Security is

important. Django basically has you covered, but you have to use Django appropriately, and that means, I would say, definitely run the deployment checklist and make sure you're using environment variables. If you do those two things, Django will hold your hand. There's a host of smaller things, but that gets you most of the way there.

I'd say one more: filter input and escape output. If you do that, the major attacks are covered.

Yeah, so three things.

Because it's cross-site scripting, right? That's the big danger.

But if you're using Django templates, they automatically auto-escape this for you.

They do, and forms also do validation, so forms will also protect you. But the difficulty comes when people want to, for instance, allow user-submitted URLs. Then there are filters, like iri_to_uri, which you should run any user-submitted URLs through to make sure they're correctly URL-encoded. Escaping is a bit more complicated than "Django does it all for you". Yes, if you just take the user-submitted content

and you put it into a template, Django will auto-escape it.

But sometimes you don't want it escaped

because an escaped URL won't work.

So you need then to run it through the appropriate filters

to make sure it's safe to present that URL.
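As a rough sketch of what "running a URL through the appropriate filters" can involve, the following uses only Python's standard library. It is an illustration under assumptions, not Django's own filter: the helper safe_link, the allowed-scheme set, and the list of characters left unencoded are choices made for this example.

```python
from urllib.parse import quote, urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def safe_link(url):
    """Return a percent-encoded URL, or None if the scheme is not allowed."""
    if urlsplit(url).scheme.lower() not in ALLOWED_SCHEMES:
        return None  # rejects javascript:, data:, and scheme-less input
    # Percent-encode non-ASCII and unsafe characters, leaving URL punctuation alone.
    return quote(url, safe="/#%[]=:;$&()+,!?*@'~")


print(safe_link("https://example.com/café?q=naïve"))
print(safe_link("javascript:alert(1)"))
```

Checking the scheme matters as much as the encoding, since a link that starts with javascript: is dangerous no matter how carefully it is escaped.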

Right. And you mentioned too just the Django Bleach package,

which we'll link to.

Well, Bleach is a Mozilla package.

So say you actually want to present user-submitted HTML.

So you can't escape it because it ceases to function as HTML once it's escaped.

So you have to sanitize it, and Bleach is a sanitizer,

and it will enable you to provide a very limited set of tags

and a limited set of attributes,

and anything which doesn't match those gets stripped out.

So for instance, GitHub have a markdown editor field.

Right, you can't put any old HTML in there. There's a very limited set of HTML. You try putting a script tag in there and it just won't appear, because they filter that out.

I was just wondering, so there is that Django Bleach package, if you would recommend using that versus...

Oh, I've never used that. I didn't even know there was a Django Bleach package. For me, I just use Bleach.

And I guess, one last tangent: that's something I find more and more as I'm progressing in my Django journey. Specifically with Markdown, I'm building something that uses Markdown, and there's a whole bunch of Django Markdown packages, but I can also just pull in, what is it, Markdown, or whatever the Python library is, and create a custom template filter. It's sort of a nice place to get to, when I look at an app and can think, do I really want all that functionality and all the trust that comes along with it? A lot of times it's helpful, but more and more, if I can do it concisely, I would rather roll it myself.

Okay, so this comes back

neatly full circle. The number two risk, we said, was not updating. And why don't people update? They don't update because there's a third-party dependency which isn't compatible with the new version.

Yes.

So taking on third-party dependencies is a big issue. We just had one with Django today. Python 3.8 has been released, and there's this traceback library that we use with the test runner to format tracebacks. That's great, but it's got a bug against Python 3.8 that's fixed in master but has not been released yet. So we've had to insert a conditional "skip this test if this version is lower than" in order to get around that. And Django doesn't take on dependencies for exactly this reason, because then we're tied to them. Okay, that will be released soon and then we'll be able to drop that skip-if. But projects really get stuck: oh, I'm using this third-party package which hasn't been updated in three years and isn't compatible with any supported version of Django, so I'm on an end-of-life Django because of some third-party package that I took on because it saved me half an hour on the first day.

Oh, that is so true. I think half our listeners

are going to go... and half are going to be like, that's an interesting idea, but...

Right, so don't take on third-party dependencies. I mean, not unless you...

Don't do it lightly.

Yeah, don't do it lightly. What's its update cadence? You mentioned crispy forms. It hadn't been updated for a year and a half. It hadn't needed to be, because it was 2.2 compatible a long time ago, and we've just put out a release which is 3.0 compatible. And since my talk at DjangoCon Europe I've had loads of new contributors helping me get the Bootstrap stuff going much better, so I'm really pleased with that. Thank you, everybody.

Right.

But the whole point is crispy forms is stable. A year, a year and a half between releases is not a problem for crispy forms. But a package that's new, that's only had 30 commits and only got one contributor, you've got to be really cautious about taking that on. Can you get what you need from it without having a third-party dependency? If you can, I would recommend it.

Right. It'd be nice if there was a check mark, so that you could see something hasn't been updated in a year, but that's because it's fine, because it's super stable. Like a stability indicator. Because even if crispy forms, or whatever package, is stable, there's still going to be a whole ton of issues and PRs, and if you're not knowledgeable you may look and see, well, this hasn't been updated for a year and a half and I see a whole bunch of PRs, I don't know if I can trust it. And it turns out those are minor things, versus a big one, like there isn't Python 3 support.

Yeah. I don't know the answer to that, to be honest. That's difficult. That took me a long time. I used to be like, oh, I can just bring in this third-party dependency and bang some stuff out, and then you just find yourself getting into a spider's nest. Is that the right way to say it? A spider's nest, a spider web, I don't know. You get tangled up in these dependencies which aren't really giving you much value.

Okay, that's a lot to say on security. Hopefully that helps everyone. We'll have links to everything. As ever, we are at djangochat.com, ChatDjango on Twitter, and we'll see you all next week. Bye-bye.

Bye. Join us next time.