Transcript: GeoDjango - Anna Kiefer
Hi, and welcome to another episode of Django Chat. I'm Carlton Gibson, and I'm joined as
ever by Will Vincent. Hello, Will.
Hi, Carlton.
And today we have Anna Kiefer with us. Anna, we're going to talk about GeoDjango. So how
are you, Anna?
I'm doing well. How are you both?
I think we're marvelous, but thanks for coming on the show.
Yeah, happy to be here.
Well, awesome. Let's kick off. Perhaps a good way of getting started: if you could introduce yourself,
tell us about your background, how you got into programming, for instance, or how you discovered
Django and such things like that.
Yeah, yeah. Actually, so I got into software development through a bit of a roundabout way.
I was not sort of formally, you know, trained as a computer software engineer. But I was
working at World Wildlife Fund on their renewable energy team. And I was tasked with helping sort of
design build, you know, a simple website for some of our corporate renewable energy partners. And
that's where I got introduced to software development. And I really love that. And so I
did a fellowship in San Francisco. And, you know, that gave me a little bit more, of course, more
foundational knowledge. And I wanted to combine my sort of interest
in energy and the power grid with programming and software engineering.
So building web applications to, you know, sort of track and measure the energy grid, and I
guess the physical world around us. Yeah. And so I started as a software engineer at
Kavala Analytics, and they're an energy analytics startup in San Francisco. And they do a lot of
mapping. So, you know, mapping feeder distribution lines, substations, transformers, solar panels,
so mapping all and aggregating all this energy data. And that's where I started learning Django
and using GeoDjango.
Okay. And so what were you using before, just out of interest?
Well, I don't think we were really utilizing anything. I think just Postgres, right? And so doing raw, you know, using psycopg2 or something, and connecting to Postgres that way. Very hacky, I guess, not really using an interface.
But you were lucky enough to start with Python, then?
Yes, yes.
Fantastic.
Yeah, yeah. And so when I came on board at Kavala, they were already using Python and already using Django. I don't think they had started utilizing GeoDjango.
Okay, so tell us about GeoDjango. What is GeoDjango?
Yeah, so GeoDjango is Django's sort of answer to geospatial
mapping. So it's their geospatial web framework. So it's just a contrib module, basically,
through Django that handles geographic information. And it offers, you know, the ability,
like Django, to, you know, have models and classes. But in addition, it offers the ability to
handle, manipulate geospatial data, transform it.
So when you say handle geographical data, what sort of things am I going to do with it?
What sort of uses am I going to have?
Right. Yeah. So I think maybe the easiest way to describe this, maybe let's take like,
you know, type of data you might want to model, right? So say a road.
And so just like a regular Django class or model, you might have, you know, a road has a name,
a road has maybe a length, a road has maybe the state that it's in, all this sort of,
you know, metadata that are just strings. But a road also has a geometry associated with it.
And so GeoDjango allows you to specify a geometry field.
And this can be a point, a line, a polygon, you know, a multi-line, a multi-polygon.
I think there's like eight different types, something like that.
And you basically specify this.
So a road, you know, road would probably best fit as a line.
And through that, through that attribute, you're able to do transforms, you're able to do other sort of geospatial and distance queries.
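The road example can be sketched without any database at all. The block below is a plain-Python stand-in, not GeoDjango itself: the `Road` class, its field names, and the sample coordinates are all made up for illustration. In GeoDjango the geometry would instead be declared as a geometry field on the model (for a road, a `LineStringField`).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Road:
    """Hypothetical road record: ordinary attributes plus one geometry."""
    name: str
    state: str
    length_km: float
    # The geometry: a line stored as (longitude, latitude) pairs.
    path: List[Tuple[float, float]]

    def to_feature(self) -> dict:
        """Return the road as a GeoJSON Feature dictionary."""
        return {
            "type": "Feature",
            "geometry": {
                "type": "LineString",
                "coordinates": [list(point) for point in self.path],
            },
            "properties": {
                "name": self.name,
                "state": self.state,
                "length_km": self.length_km,
            },
        }


# Made-up sample data.
road = Road("Main Street", "CA", 1.2, [(-122.42, 37.77), (-122.41, 37.78)])
feature = road.to_feature()
```

The point of the sketch is only the shape of the data: string and number attributes sit next to a single geometry attribute, and the geometry is what spatial queries operate on.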
Yeah, could I create a TomTom-type app, like, you know, if I had all the roads in this?
You're dating yourself with that reference.
What's it? What's a TomTom?
Google Maps.
Oh, yeah.
Could I do Google Maps-like turn-by-turn directions, I guess, is right? Could I do something like that?
Yeah. We did not, you know, build something at all like that. We had a map. But yeah, with that and some other fancy, you know, algorithms and optimizations, you could in effect create Google Maps, where you enter in, you know,
the address of your destination, and it geolocates your specific point.
And using GeoDjango, you could route to that point, you know, the fastest way or something.
But Google Maps is quite pretty, right?
It's got those nice tiles, and it shows me the pictures.
But how do they do that with GeoDjango?
Right.
I can't use, can I use Google Maps with it, or do I?
Yeah, you could use Google Maps API, certainly.
With a, you know, using GeoDjango as the framework, backend framework, you know, so what the user sees on the front end, right, is just, you know, the Google Map, those are like Google Maps tiles.
Of course, Google Maps also uses like all the API layer and all the fancy things.
But you could certainly use, you know, say Leaflet or Mapbox with like different actual tiles that are, say, the Google Maps tiles.
And that is just the, how do I describe,
like the, I guess, what you're seeing, what you're viewing.
It's like the skin on top of the...
Right, on top, yeah, yeah.
Right, so I've heard of Leap.
Well, I was just going to say,
so I've used Leaflet and Mapbox,
and I was curious for you, Anna,
how would you describe the options available?
Because, I mean, Leaflet is open source.
Mapbox is open source data, though they're VC-funded.
And then Google is their own thing. I think, because Google, I'm remembering, I'm dating myself, five or six years ago Google did one of their periodic "we should monetize our API" things for Google Maps and charged, like, hellacious rates, and then everyone got all excited about Mapbox and these open source ones. So what's your take on the landscape? Because I know now there's Apple Maps.
Yeah, yeah. I guess OpenStreetMap is another, right?
Which I think Mapbox uses that data, right? Or used to. They were a skin on top of it. So I guess the two parts are: there's proprietary data, which Google and Apple don't share, and then there is OpenStreetMap, which I believe is what powers Mapbox and Leaflet.
Yeah.
And Leaflet is more open source, whereas Mapbox is also open source but has raised a ton of money.
Right, right. I think ultimately Mapbox is the umbrella over Leaflet, and Leaflet's the Mapbox sort of free...
Oh, yeah, that's right. They hired the Leaflet guy, as I recall. But anyway, so how do you think about this with projects, right? Because I simply break it down into like, yeah, there's Google APIs, but they could charge a lot. And there's open source. What are the factors you think about like for a personal project? Like, because you have some really cool ones on your site, which we'll link to versus an enterprise setting. Are there any distinctions there that you would make?
Right. I would say ease, you know, ease of use. And that's where Leaflet and Mapbox, to me,
have really made it, like, their documentation is really great.
They're way better.
Yeah. I mean, we were actually, at Kavala, we were using OpenStreetMap and not using Leaflet or, you
know, Mapbox. And we found it a little bit, you know, they're a team of,
I think they're volunteers, you know, so it wasn't just as full.
Anyone can do it.
Like you and I could go in and it's sort of like Wikipedia, I believe.
Right, right.
Which, you know, in some regards is really great.
But I guess they were less full-featured than we wanted.
And so, you know, Leaflet just has so many sort of bells and whistles
if you want to create a map that looks really nice.
It's got the ability to do choropleths, which are just overlays.
Yeah, it's got, you know, a really easy way to code. Let's see, you know, satellite toggle, pop ups, you know, all these sort of the things that you know, a user would expect to, you know, to see on a map.
Yeah. Can you say a little more about choropleths? Because for people who haven't used, like, Leaflet mapping, that's a big word. How would you explain that further? I've used it, but I, you know, it tripped me up at first.
Right. Yeah.
What's an example, right? Like, I mean, example would be something like, I don't know, a color scheme for, like you have, right? You have predicting disease spread is one of your projects.
Right, right.
Where it's redder, right? That would be an example of a choropleth.
Exactly, yeah. So it's a, you know, way to visually, you know, display data when, say, points or lines don't convey the information you want.
So something that's like disease spread would be really, you know, a good way of sort of maybe, or where you would use a choropleth.
That way, you know, areas that are really affected by the disease
are really dark or dense, and areas that are not, are not.
So it's like a temperature map on a weather forecast, right?
Yeah.
Yeah, I was just thinking like a political, like a red, blue, purple,
like U.S. political kind of map.
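The idea behind a choropleth, shading each region by the class its value falls into, can be sketched in a few lines. This is an illustrative plain-Python version, not Leaflet's API; the break points, the colors, and the per-state rates are all invented for the example.

```python
from bisect import bisect_right

# Hypothetical class breaks and a light-to-dark color ramp (one more color than breaks).
BREAKS = [10, 50]
COLORS = ["#ffffcc", "#fd8d3c", "#bd0026"]


def fill_color(value):
    """Pick the color for the class that `value` falls into.

    Values below 10 get the lightest color, 10 up to (not including) 50
    the middle one, and 50 or more the darkest.
    """
    return COLORS[bisect_right(BREAKS, value)]


# Made-up disease rates per state.
rates = {"CA": 4, "NY": 19, "TX": 72}
fills = {state: fill_color(rate) for state, rate in rates.items()}
```

A front-end mapping library would then paint each state's polygon with its color, so the darker regions are the more affected ones.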
Okay.
And all of this is built into Leaflet, and that's Leaflet.js.
Okay.
Okay. And is that part of GeoDjango, or do you have to pull that in for your front end when you're writing your templates?
Right. So that's not part of GeoDjango, exactly.
When you, you know, you basically, after you, you know, construct your view, you throw the sort of leaflet, you use the leaflet API to just pull a map up in the browser.
Yeah. And display your tiles. And then you send your GeoJSON oftentimes from Django to, you know, display on the front end.
Okay. And how do I serialize? So say I've created a Django model, a GeoDjango model, and I've got some, you know, geographic data in there.
I've got, you know, my roads or the shops in my town where I want to go shopping next week.
I've put that into GeoDjango for some reason. How do I send that to the front end?
Do I need to use REST framework, or can I just send...
Does GeoDjango give me the serialization tools I need just to send GeoJSON?
That would be the same way you would send data through Django.
So, you know, you would use, I guess, views.py.
It's been a little while since I've used Django, but yeah.
So you use your views.py and then your URLs, and then you use GeoDjango to, you know, once you've retrieved your data from whatever database, to serialize that data as GeoJSON.
Yeah, and then…
But the point, the question was that GeoDjango gives me the serializers I need to turn…
Right, okay.
Yeah.
Because sometimes you need to put in a third-party like REST framework or I don't know what.
Right, right.
I believe it gives you the serializers you need.
Yeah.
Fine.
And along with, you know, I think the biggest benefit or the biggest, yeah, the biggest benefit, I guess, that I liked using GeoDjango for was that, you know, you didn't need to do these sort of hairy SQL queries, you know, raw queries in your Python.
It provides an interface to PostGIS, which is Postgres's, you know, geographic database, to do these, perform these queries, you know, much simpler.
And so, you know, a query, say, taking that roads example, say you want to find, you know, all of the roads that intersect with a hospital or that, you know, run into a hospital or that are within, you know, a certain distance of your hospital.
GeoDjango provides a really easy way of filtering and doing these distance queries.
Whereas with, you know, just with SQL, it's, you know, a little bit more confusing, I guess.
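To make the "roads within a certain distance of a hospital" query concrete, here is a rough plain-Python sketch of what such a filter computes. It is not how GeoDjango or PostGIS implement it: the function names and data are hypothetical, the distance uses a simple equirectangular approximation, and a road counts as "near" if any of its vertices is near, which ignores the segments between vertices. GeoDjango exposes this kind of filter through ORM lookups such as `distance_lte`, with the database doing the real work.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def approx_distance_m(a, b):
    """Approximate ground distance in metres between two (lon, lat) points.

    Equirectangular approximation: reasonable only over short distances.
    """
    lon1, lat1 = a
    lon2, lat2 = b
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat) * EARTH_RADIUS_M
    dy = math.radians(lat2 - lat1) * EARTH_RADIUS_M
    return math.hypot(dx, dy)


def roads_within(roads, target, max_m):
    """Names of roads with at least one vertex within max_m of target."""
    return [
        name
        for name, path in roads.items()
        if any(approx_distance_m(point, target) <= max_m for point in path)
    ]


# Made-up data: one hospital and two roads as lists of (lon, lat) vertices.
hospital = (-122.400, 37.760)
roads = {
    "Mission St": [(-122.401, 37.761), (-122.410, 37.770)],
    "Ocean Ave": [(-122.500, 37.800), (-122.510, 37.800)],
}
nearby = roads_within(roads, hospital, max_m=500)
```

Even this simplified version needs unit handling and trigonometry, which is the kind of detail the ORM and the spatial database take off your hands.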
Yeah, I mean, that's the Django ORM to save us from raw.
So you had a fantastic talk on GeoDjango, which we'll link to, which I think is how Carlton and I both came to know of you.
But can you recall, what was the process of, like, how would you recommend someone learn GeoDjango?
Because it is a somewhat advanced use of Django and these other technologies.
Like, how would you baby step someone up that curve if you could do it again?
Because I suspect you were kind of maybe thrown in the deep end.
Or did you start more on, like, the front-end JavaScript side before?
Yeah, I started actually with Flask, as many, I think, Python users do, you know, go from Flask and then think, you know,
I need a little bit something, you know, something with a little bit more, I guess,
bells and whistles and structure. Batteries. Yeah. Yeah. And then, you know, I just remember for
the first project I did, I was banging my head against the wall. I was trying to get,
oh, it was, I was trying to get, I think it was, I did this project where I mapped landfills across
the US because I wanted to, I wanted a user to be able to enter their zip code and then see where
their trash, you know, routed to. And if that landfill was full, you know, or how close to
capacity that landfill was. Anyways, and I remember banging my head against the wall,
trying to, you know, figure out, you know, how to get this data to just display on my tiles.
And it was probably something very minor that I had done wrong. And, you know, all of a sudden,
the entire US was full. You know, I must have made one tweak, and all the US was full with landfills, you know.
Yeah, I mean, mapping stuff, that's quite realistic. It's hard. And then you also have the visual component, so it's not just like a little, you know, error. You can see the mistake that you made.
Yeah, yeah, yeah. And oftentimes, you know, there's simple things like the lat long, you know, switching those.
Yeah, I know you...
Yeah, yeah. So geographic systems, oftentimes, I think nearly all the time, actually, PostGIS, certainly GeoDjango, they use longitude and then latitude to define a point, whereas users are a little bit more familiar with latitude, you know, and then longitude following.
Right, XY makes more sense than YX.
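One way to guard against the lat/long swap is to funnel every user-facing pair through a single helper that reorders it. The sketch below is a hypothetical helper, not part of GeoDjango; it builds a GeoJSON `Point`, whose coordinate order is longitude first, then latitude.

```python
def to_geojson_point(lat, lon):
    """Build a GeoJSON Point from a user-style (lat, lon) pair.

    GeoJSON stores coordinates as [longitude, latitude], so the order is
    flipped here. The range checks catch a swap only when the longitude
    is outside the valid latitude range (-90..90).
    """
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range: {lon}")
    return {"type": "Point", "coordinates": [lon, lat]}


# San Francisco, given the way most people say it: latitude first.
point = to_geojson_point(37.77, -122.42)
```

Passing the values the wrong way round, `to_geojson_point(-122.42, 37.77)`, raises `ValueError` here because -122.42 is not a valid latitude. A swap between two small values would pass silently, so the check is a safety net rather than a guarantee.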
Right. Anyways, yeah. So how would I get started with GeoDjango? So GeoDjango does have this great
tutorial on seeding a database with some GeoJSON or with a shapefile. And then it walks you through
the process of querying that data and manipulating it a little bit. What you could also do is just,
if you don't want to worry, you know, don't want to deal with, you know, maybe setting up a database
connection, you could just, you know, import GeoDjango and grab some GeoJSON, you know, in a
one, you know, in a Python file and just, you know, use the methods that way. That's really
not using all of, you know, the things that GeoDjango offers, but it would allow you to say,
you know, very simple sort of queries.
Well, because you could pull in, like, a GeoJSON of a map of the US states, right? Or just to sort of explain how, so GeoJSON, excuse me. But I mean, if you had a discrete data set of GeoJSON, it would have, like, the outlines of the, am I getting it right?
Yeah.
So what would you import, right? Like, where would you get it, and what would it actually look like before you applied chloroglyphs, uh, chloropleths to it?
Chloropleths, right, right.
Pleths. I'm saying it wrong. I think it's choropleths, but I always forget if there's one or two.
Yeah, like "chloro" or "choro".
It sounds very, yeah, scientific, right? Anyways, yeah, that sort of didn't really make any sense, but I guess I was asking: so you have your data, and then you have how it's displayed. How does GeoJSON differ from the skins that we've been talking about, like Leaflet and Mapbox? Because a lot of times you need to have the GeoJSON yourself, or get it from somewhere, before you can apply that skin on top. But it's different from the coordinates that would be in the database itself.
Right.
Well, in your Django database.
Yeah. So a Leaflet
or a Mapbox or some other, you know, sort of front-end web framework like that
allows you to display this data.
And so the GeoJSON or the shapefile or whatever, you know,
that is the actual data that you care about that you're going to put on the map
and you're going to use Mapbox or Leaflet to, you know, get it on your map and to display it.
And so GeoJSON is just one form of geographic data, and this is, you know, GeoDjango and PostGIS allow you to transform, you know, you can turn your GeoJSON into a well-known text or well-known binary, and, you know, and use it that way as well.
Right. Well, I guess I'm trying to think. So like if you had like a table of GeoDjango data, so rectangular, and then on top of that, you could get the GeoJSON or your shapefiles. And so then you could say, if I'm within the shape of a US state, you could map and see, does the data point in my database fit into this particular shape? And if so, apply some logic to it.
I'm just trying to verbally think about how all the pieces stack together for people who've never used this before because it is a little confusing.
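The "does this data point fit into this shape" check can be sketched with the classic ray-casting test. This is a plain-Python illustration on a flat plane with made-up data, not what GeoDjango or PostGIS run internally: the square "state" and the three data points are invented, and polygons with holes are not handled.

```python
def point_in_polygon(lon, lat, ring):
    """Ray casting: True if (lon, lat) lies inside the polygon ring.

    `ring` is a list of (lon, lat) vertices. A ray is cast from the point
    in the +lon direction, and each edge it crosses flips the inside flag.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical square "state" and data points as (lon, lat, disease_score).
state_outline = [(-10, 0), (-10, 10), (0, 10), (0, 0)]
cases = [(-5, 5, 19), (-2, 8, 40), (3, 3, 7)]

cases_in_state = [c for c in cases if point_in_polygon(c[0], c[1], state_outline)]
count = len(cases_in_state)
```

Layered this way, the bottom is the point data, the outline comes from GeoJSON or a shapefile, and the containment test is the logic that joins the two before anything is drawn.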
Right, yeah.
So you would have, you know, your—
Like, what's the bottom?
Yeah, so the bottom would be just the data, right?
Like, coordinate data with whatever company information.
So, like, at lat, long, like, disease is 19 out of a scale of 100 or something.
Yeah, yeah, yeah. Would be, yeah, just your, your data. And then, you know, you would put that data, you know, into a data table. And again, you would assign a, you know, geometry field to this data.
How would you do that?
That's just done in Django classes.
Oh, but there's a couple of options, right, for how you would do that?
Right.
Because that's like a model class.
So, like, you've got your normal Django model, like your disease rate, and then you want
to say it's at this particular location.
Is that, so you could add a point field, would that be?
Yeah, yeah.
So, say you have, you know, let's say illness, instances of illness or something, right?
You know, you could have, yeah, you could have, you know, one instance of illness would
be a row in your table, and then you would define, you know, a column that's that lat long at that location. And that would just be a Django point field. And then you'd be able to, you know, take a polygon, say, from a different table, maybe you have your states table, your US states or towns, and, you know, do a query to combine these. So you could say, how many instances of illness are within this particular state?
Okay. And then you'd show it on the map.
Right, right.
Okay.
And this is where, you know, GeoDjango makes this all very easy and fast.
Super. Okay. And a couple of times you've said PostGIS, which is the Postgres extension that enables all this geographic stuff, right?
Yeah.
Yeah, yeah. So GeoDjango can be set up, I think, with a number of different databases. MySQL, SQLite.
SpatiaLite is like the SQLite version. So that's just a file that... And one might think that's the simplest to get started with. But is that the case? Is it simplest to get started with that or simplest with Postgres?
I guess so in the sense that you don't need to. I think that's what Django comes automatically with SQLite if I'm like...
It uses it as sort of its default.
But it's quite easy to get started with Postgres as well.
I think it's that you need to add either SpatiaLite to SQLite,
which is built into Django,
or if you install PostgreSQL,
you need to add PostGIS on top of it.
Right.
It's the installation, right?
What would you recommend people to do?
Which of the two?
Yeah, or the three.
Well, Oracle's not supported,
and then MySQL, it's a bit of an issue, I believe.
But you're the expert, so you tell us.
I've actually used primarily Postgres and PostGIS.
So I would have to say, you know, use those,
but I don't want to make any enemies.
I've heard tell it's the most capable as well.
What, it's PostGIS?
Yeah.
Yeah.
I've heard people say that it's the more advanced, it's the...
Yeah, it's, you know, they've done so much with geospatial data.
We were, you know, at my current company,
we're doing some of these queries by hand using Go, a different, you know, not Python.
And they get very complex very quickly.
So, using something like GeoDjango and PostGIS really just makes these queries and filters so much easier.
Leave the hard work to people who have already done it.
Yeah, okay.
Because...
Is the idea...
Yeah.
Sorry, go ahead.
Oh, no.
Sorry.
I'm sorry.
Yeah, I was just going to say various, you know, things that geographic data can be, I wouldn't say finicky, but more like it can be easy to get incorrect, particularly with units.
When you're doing filters, you know, is it meters or miles?
There's something, there's a geography concept, and then there's a geometry concept.
And GeoDjango also, you know, reconciles the two of these.
But a geometry concept takes the, say, polygon or line, takes the shape and models it on a 2D plane, whereas a geography uses a spherical representation.
So that's sort of, you know, where the state comes in, right?
So it's taking, you know, these points and actually models them to coordinates.
And this concept, you know, you need to make sure to transition between the two at points or else, you know, you might be, you know, your units can get all messed up.
And so does GeoDjango do the right thing there?
And, you know, if you pass a geometry into a geography, it says, no, hang on, you've got to convert these first.
Yep, yep.
And you can specify typically, I think geometries are a little more common to use, but you can specify that you want to use a geography.
And then it will handle all of that for you.
You don't need to, you know, be doing transformations and crazy sort of, you know, X, Y to, you know, to your projected coordinates, all that.
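The difference between the flat (geometry) and spherical (geography) views shows up clearly with one degree of longitude at two different latitudes. The sketch below uses the standard haversine formula on a spherical Earth; it is an illustration with hypothetical function names, not the calculation PostGIS performs.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def planar_distance_deg(a, b):
    """Distance on a flat plane, in degrees: treats lon/lat as plain x/y."""
    return math.hypot(b[0] - a[0], b[1] - a[1])


def spherical_distance_m(a, b):
    """Great-circle distance in metres between (lon, lat) points (haversine)."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


# One degree of longitude at the equator and at 60 degrees north.
planar_equator = planar_distance_deg((0, 0), (1, 0))
planar_60 = planar_distance_deg((0, 60), (1, 60))
sphere_equator = spherical_distance_m((0, 0), (1, 0))
sphere_60 = spherical_distance_m((0, 60), (1, 60))
```

On the plane both pairs are exactly one unit apart, and that unit is degrees, not metres. On the sphere the pair at 60 degrees north is only about half as far apart as the pair at the equator, which is why mixing the two models without converting leads to wrong units.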
Right. So the more I listen to you, the more I think, wow, actually, there's just masses and masses being wrapped up here. Masses of maths that you don't have to know, masses of SQL that you don't have to write.
Yeah, absolutely, absolutely.
Yeah. So what's the Go equivalent, right? Because I assume at your current job the idea is that Go is "faster", in quotes, than Python, and therefore you can't use all these nice things. Is there an equivalent framework for Go spatial data?
There are some. I'm trying to think of the one. They're nowhere near as popular as, let's say, something like GeoDjango.
So you have to stifle the urge to say, you know, this was solved if we'd been using what I previously used?
Yeah. In fact, we're switching back.
Okay.
We're switching back, believe it or not. Yeah. And part of the reason is that we were creating all of these, you know, all these functions by hand, basically. I mean, we're using the Go GIS libraries, but even with that, we needed some custom functions that aren't available in Go. So actually, yeah, we're switching back to using Python, using PostGIS, which we could have been, we were certainly using with Go as well, but...
Right. So the database remains the same. It's a different framework on top of it.
Yeah, it's more the, you know, Python geospatial libraries that are really full featured, that we weren't quite getting as much in Go. Again, I don't want to make any enemies and say that there aren't great libraries to use, but...
No, but it's kind of interesting, because, you know, on the chip, Go is faster. It's a faster language, it's more low level, it's got higher throughput. But there's developer time as well, and, you know, if your Python is fast enough, the developer time trumps it.
Right, right, right.
Well, I mean, as we talk about often here too, generally in a web stack the language isn't the big issue. There's a whole lot of other things that are slowing you down.
Yeah, yeah. And we found the development with Go, like I said, and with creating all these geospatial functions ourselves, to be, it was costly developer time. And we also aren't sure we're doing it right all the time.
Yeah, right, okay.
You know, you can write your test for your geographic function, and you can make your test pass. It's the beauty of tests: you also write them, so the tests pass, but it still may not be entirely accurate.
Yeah, sometimes an algorithm matches the test case, but...
Right, or the test cases, but then, yeah, they're both there, right?
Yeah, right. Well, that's why you always want to check, you know, does it do what I think, and also doesn't do this other thing that it shouldn't do, because sometimes it just always passes.
Right. Well, so day to day, what does it look like? What are the challenges that you wrestle with? I mean, it sounds like
maybe classic data science, where a lot of it is, it's a little bit of algorithms and testing,
but a lot of making the data fit and not be wrong, right? Because if it's 0.1% off,
it blows everything up. Is that accurate? Or how would you describe, like, would you describe
yourself more in the data science side or more some other area? Right. Well, I would say so my
previous role, I mean, at Kavala, well, it was a smaller team than I'm on now, which was awesome
because you've got to work totally doing everything, you know,
from front end, a lot of mapping and choropleths to the API layer
to, you know, the data ingestion and data pipeline
and to our, you know, sort of energy, developing our energy methodologies.
And now I am on the data engineering
and sort of a little bit more on the data science side of things.
And, you know, I'm no longer doing, I guess, the front-end mapping,
although we certainly have a need for, you know,
for a fully-fledged front-end and map framework,
but I'm not currently doing that.
So I'm doing more of our data science,
productionizing the data science methodologies
to do things like create a power network using, you know, machine learning.
So figuring out, you know, where feeder systems and feeder lines are on the grid from, you know, from actual data, feeding it into a machine learning model, and then validating it and seeing, you know, how close we were using the actual data.
And I assume that's, I mean, that's very Python based, mostly.
Yeah.
Which packages are you using day to day?
Which are the Python ones?
Yeah, you know, right now we're not using a whole lot.
And I almost wish we were using something a little bit more.
I am spending my time now doing a lot of these raw SQL queries in my Python.
Maybe you get to do an outer join, right?
There's a discussion online.
Simon Willison and some others were saying they've never in their career done an outer join.
Oh, really?
That's asked in every...
Yeah, in every interview, right.
But you've never, you know, in a 20-year career, Simon said maybe once or twice he'd used it.
Oh, that's crazy.
No, I would, yeah, I do.
Well, I mean, they're, I wish it were just an outer join, right?
It's like an outer join with, you know, various other, like this, you know, get all the feeder lines within a certain distance, but then match them to the buildings, you know, and then do a sum of the aggregate on the buildings, you know,
So it can get pretty hairy.
Is that something where, do people talk about like GraphQL in this context at all, given
that you have, you know, Go like non-relational kind of?
Yeah, yeah.
So before I was on the data team, this data engineering team, I've only been on the team
about maybe a month or two.
I was in Go land, where it was GraphQL.
We were using gRPC and, yeah, and GraphQL.
So postpone the decisions on the relations until later.
Right. Yeah. I'm just kidding. Sort of. Okay. I mean, it seems like, especially the data sets
you're dealing with, I mean, it's a sort of, it would make sense to maybe use a non-relational
in that context.
Right. Right. And we actually used Dgraph, which was a sort of a very, I think it's small. I think it's a team of like five people. And it's a
graph database written in Go.
So we were using that, and it's very fast.
But again, it didn't have its syntax.
I guess we're also used to SQL and the SQL syntax.
And this really kind of turns it on its head.
And so, again, we were spending a lot of developer time
just trying to get this syntax working to query the database.
I think with, you know, obviously with practice and more experience
and as the Dgraph team, you know,
that software becomes a little bit more mainstream.
We might go back to using it.
But for now, you know, given all of the geospatial data
that we do deal with, we wanted something like,
that we were comfortable in.
That being SQL.
It's interesting because like a graph database,
I've always found them deeply fascinating, but just never really been able to make it work in
the real world compared to the established stuff. But it's like, yeah, everything's a graph problem
if you look at it in the right angle. And surely we should be able to solve it this way. And then
I can use these nice little pathfinding algorithms. And yeah, yeah, it's all great.
I was really interested to use it. I was really happy that I got, you know,
a couple months experience using it, even though we're not using it in production now. But
yeah, and it's a whole sort of mind shift, because we kept saying, you know, I kept saying, "Oh yeah, so the rows in the table," and then you're like, "Oh wait, but they're not rows, right? In fact, it's not a table." And you're like, "What? Okay, the nodes." And then we kept having to, you know, change how we describe this data: the nodes in the graph, right? And it's just a different conceptual way of thinking of data.
Right. But it's kind of interesting. In software design manuals they're always talking about matching the mental model of your users. In this case, you're the user, and the database isn't matching your mental model, and so there's an incongruence. And yeah, that's kind of cool.
Yeah, yeah. And I think, I mean, I think
like we're still trying to find how to use graph databases because some of our problems are really
more suited to graph databases. Like we have something that we kind of call dependency chain,
and that is a link of sort of cascading of, I guess, dependencies, right? So if you think about
a power system. You know, power system has a dependency, say, you know, on water, maybe. So
a water plant, let's just use this as an example. And your hospital is downstream, and that has a
dependency on both the water plant and the power system. And so, you know, you've got these sort of
cascading dependencies, and modeling that in a table could be pretty hairy because, you know, you've got your, okay, dependency one, but there's a lot of many-to-many relationships going on, and something like a graph, you know, can model that a lot better.
Yeah. As soon as you have one many-to-many, I start going, "Oh my God, here we go."
Yeah.
Yeah. Yeah. How many association tables you need, you know. And that's something we haven't really addressed, I don't think, yet. At least at my company, you know, there's technologies that have, but.
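The dependency chain from the example (the power system depends on the water plant, and the hospital depends on both) maps naturally onto a small directed graph. The sketch below is plain Python with hypothetical names, not Dgraph or any graph database API; it answers "what is affected downstream if this fails?" with a breadth-first walk.

```python
from collections import deque

# Edges point from a provider to the things that depend on it.
# Node names follow the example in the conversation; the structure is a sketch.
dependents = {
    "water_plant": ["power_system", "hospital"],
    "power_system": ["hospital"],
    "hospital": [],
}


def affected_by(failed, graph):
    """Everything downstream of `failed`, in breadth-first order, no repeats."""
    seen = set()
    order = []
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order


cascade = affected_by("water_plant", dependents)
```

In a relational schema each of these many-to-many links would need an association table and a recursive query to follow the cascade, whereas in a graph the traversal is the natural operation.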
Right. Well, I mean, the thing about data science, right, is at least from I was just looking when I was studying this, this is a couple years ago, there was this Twitter feed, Big Data Borat, which had all these pithy sayings. And I mean, I remember being struck. I spent almost all my time cleaning the data and then a teeny amount actually running an algorithm or something I made. And it's just kind of like it felt very, you know, janitorial. But I think that's just kind of how it is, right?
It's toil.
Like it's hard to clean it and you have to do it the right way.
And then, yeah, I guess you go down.
Yeah, so there's some quote about that.
Yeah, yeah.
And in fact, in both roles, and I think, you know, we deal with a lot of sort of students, you know, who also deal with geospatial data.
And, well, there are certainly uniform ways of handling geospatial data, you know, there's GeoJSON,
and of making sure that the data is the same from one government, you know,
one organization to another. But a lot of the format is not the same. And so we can build an entire
data pipeline for, you know, our US data, right? So that's ingesting US census data.
But we have the need now to ingest Japan census data, and it's entirely different. So that data
pipeline, you know, we need to change drastically in order to get it to work with, you know, these
these files, basically. Um, yeah.
And I always get the impression that data pipelines, you know, I'm more of a web jockey, but data problems are always like the sort of one-shot scripts that I write for myself to do stuff, and they're sort of bodged together with duct tape, and, oh, you know, I'll just run this here and it'll kind of work. Good to go.
Yeah, right.
Is that your question, is that accurate?
Yeah, yeah, yeah. You know, at least at both companies I've been in, it almost seems like once the data's in, right, you do it once, almost, and then you don't have to do it again. We're not talking machine learning models or anything, we're just talking static data. And so, you know, you have all this, yeah, the duct tape, and it's all shoddily constructed, but then it's in, right? And every time you're in that process, you're going, oh gosh, we need a better way of doing this. But then it's in, you know, and you kind of forget about it.
That's right.
Yeah, yeah.
Right. Well, you don't want to prematurely optimize it, right?
Right. Don't just polish your tooling, got to use it.
Yeah, yeah, yeah. And so,
like, right now we are developing, you know, well, knock on wood, but we're trying to develop a more flexible, fully featured data pipeline that we can use when we deploy, say, in Japan and China and India, and, you know, where it can fit with... I still don't know how this is going to work. Just being the one to, you know, duct tape the pieces together, I'm questioning how we'll find a solution. But...
Exactly, that's what makes it fun.
Yeah, that's the programming challenge, right?
Yeah. Well, I think, how often does this change, right? Because some data sets you have to ingest, you know, minute by minute, if it's like power meter, and then some are, I don't know, monthly, yearly.
Yeah, and census data is every 10 years, right? And so if you only have to do this, you know, hacky construction once every 10 years, you're like, that's not so bad. But something like energy pricing data is, I don't know, every couple minutes. Or meter data, like you're saying, is every couple minutes.
Well, yeah, because in the Boston area there's a couple companies that are using Django for, like, Internet of Things, to check on solar meters. And as I recall, I remember being like, why don't the device manufacturers do this? But basically they need a degree of separation. So, yeah, I won't name names, but it's a very interesting problem, and it's kind of shocking that it isn't addressed by a big behemoth that I'm aware of. It's sort of smaller places that are doing this monitoring service for these power grids.
Yeah. Um, yeah, I think, you know, we're still at the forefront of all the smart grid and grid optimization work, and it's going to be exciting to see how it shapes up, because we have this old infrastructure, and then you're trying to glean insights and figure out, basically, how to optimize power on the grid.
Right. Well, I mean, Internet of Things is, I would
think, Carlton, like a pretty good use case for, you know, async Django stuff. I mean, depending on...
Async anything, but yeah.
Well, right, async anything, right? I mean, aside from, you know, a chatbot or something, that's a real-world application. When I think of async, because Django is adding async in 3.0, that's the first thing that comes to mind for me: all these things where you have something just pounding off data, you know, every second or faster.
Yeah, so whenever you need anything to handle an event at an unknown interval, then async becomes perfect. So Internet of Things is perfect. You know, it doesn't know when you go and open your fridge door, but when it gets the fridge door, then it needs to handle it and, right, tell Amazon to send you some new milk.
Interesting, right. I guess it's the unexpected part, yeah, because otherwise you could just plan for it, but it's the spikes where...
Yeah, yeah, async really shines.
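To make the "event at an unknown interval" point concrete, here is a small sketch using Python's standard `asyncio` library. It is not Django's async support, just the underlying idea: the handler waits without blocking until an event shows up. The fridge sensor, the event fields, and the function names are all made up for illustration.

```python
import asyncio
import random


async def fridge_door_sensor(queue, n_events):
    """Hypothetical device: emits door-open events at unpredictable intervals."""
    for seq in range(n_events):
        # The handler has no idea how long this wait will be.
        await asyncio.sleep(random.uniform(0.0, 0.01))
        await queue.put({"device": "fridge", "event": "door_open", "seq": seq})
    await queue.put(None)  # sentinel: no more events


async def handle_events(queue):
    """Wait for events as they arrive and record the order they were handled in."""
    handled = []
    while True:
        event = await queue.get()  # suspends here until something arrives
        if event is None:
            break
        handled.append(event["seq"])
    return handled


async def main(n_events=5):
    queue = asyncio.Queue()
    producer = asyncio.create_task(fridge_door_sensor(queue, n_events))
    handled = await handle_events(queue)
    await producer
    return handled


if __name__ == "__main__":
    print(asyncio.run(main()))
```

While the handler is suspended on `queue.get()`, the event loop is free to serve other devices, which is why many slow, bursty producers are a natural fit for this style.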
But, like, to come back to the energy optimization, because that's fascinating. I mean, I read a story a while back about Google, and obviously they control the environment in their data centers perfectly, but they use their AI to, you know, save 60 percent on the power bill or something. I don't know the number, but...
Right.
Are we going to be able to do similar with the grid? You talked about the old infrastructure. Obviously you can't just rip that up and start again, but do you think there are savings to be had, and efficiency gains?
Yeah, yeah, definitely.
And figuring out, yeah, there's certainly cost savings, there's energy savings, and not being, you know, an energy expert, I don't want to say too much, but there are certainly, like, we have very little information other than, you know, what your utility bill, right?
The average consumer has very little information about what the demand is on their, you know,
on the grid at a given time. You know, maybe they do want to, you know, lessen their energy bill in
some ways, or, you know, put solar panels on their roofs. And, you know, if you wanted to
change your energy provider, you know, basically areas have pretty much one energy provider and
that's it, right? Like we've got PG&E, yeah, utility, PG&E, and we can't, you know, we can't
change it. And, you know, I'm sure lots of people would like to. But I think giving consumers
a little bit more insight into, you know, their sort of energy and electricity, whether that's
through something like the Internet of Things and the smart grid, you know, that's going to be
really revolutionizing the industry.
Well, it's that two-parter of, you know, you could mail someone an automated monthly packet
saying, here's your usage and how it's changed.
But then you need that second part, like, here's a non-PG&E option.
Right, right.
And in the solar context, you have an interesting thing where you have people who are net contributors to the grid.
And actually, it sort of messes things up for the utilities in a good way, like if there's too many people giving stuff.
Because electricity actually has to kind of flow like water, like in the same way that L.A. creates all the farmland because they're stealing too much.
Anyways, is there anything we haven't covered that you want to address in the podcast?
We're sort of coming up near the end.
Or anything you want to plug?
Right, right.
I don't think so.
I mean, I would say, yeah, I've absolutely been very happy using GeoDjango.
And for anybody who hasn't, play around with it.
And they say that people who like maps really like maps.
And GeoDjango certainly makes that, you know, makes, at least made me sort of love maps and geospatial data and doing queries and sort of making all of these data discoveries.
And you're moving back to Python world a little bit.
Yeah, yeah, exactly. Exactly. Yeah. And I think that's about it. I did write some notes. I'm trying to see.
Oh, yeah. One thing that I just on the sort of the future of of geospatial data, if I could say so.
One thing I haven't handled much is, you know, I've handled mainly 2D geometries, but handling 3D geometries I think would be really interesting.
So, you know, my current company, we have, you know, say we might map a flood, right?
And, you know, a flood has a depth.
And actually, you know, being able to map this depth, you know, we might glean other insights that we certainly might not from 2D data.
So, you know, volume of water.
Flood prediction as well.
You know, like if you've got this volume of water, this housing estate will be under threat
and it needs these kinds of barriers to protect it, that kind of thing.
Exactly.
In fact, my current company, One Concern, does exactly that.
Okay, there you are.
So you have really just hit it on the head, yeah.
Probably fine, too bad.
Yeah, yeah.
And I know PostGIS does have the ability to handle, I think, 3D data,
but I haven't played around with that too much.
But I would be very curious, too.
Okay, but that's on the horizon.
That's available, too.
So that's kind of cool.
Right, yeah, yeah, absolutely.
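As a back-of-the-envelope illustration of why depth adds information that a 2D flood outline lacks, here is a minimal plain-Python sketch. It assumes the flooded area has already been split into cells with a known area and water depth; the `FloodCell` type, the numbers, and the function names are hypothetical, and no GIS library or PostGIS function is involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloodCell:
    """One hypothetical patch of flooded ground."""
    area_m2: float   # footprint of the cell, which 2D data already gives you
    depth_m: float   # water depth over the cell, the extra dimension


def flood_volume_m3(cells):
    """Approximate water volume as the sum of area times depth per cell."""
    return sum(cell.area_m2 * cell.depth_m for cell in cells)


def cells_deeper_than(cells, threshold_m):
    """Pick out the cells where the water is deeper than a given threshold."""
    return [cell for cell in cells if cell.depth_m > threshold_m]


if __name__ == "__main__":
    cells = [FloodCell(100.0, 0.5), FloodCell(100.0, 2.0), FloodCell(50.0, 0.0)]
    print(flood_volume_m3(cells))
    print(len(cells_deeper_than(cells, 1.0)))
```

Two floods with the same 2D footprint can differ a lot in volume, which is the kind of insight the flat outline alone would hide.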
Yeah, I mean, I have to play around with this a little bit.
It's been, I haven't used Django when I'm doing mapping.
Well, I haven't done anywhere near the functionality we can,
so I'm sort of excited to do that again.
Yeah, yeah.
Well, so we will link to, you have a personal site.
And, yeah, thank you so much for coming on.
Thank you also for your talk last year.
I thought that was a great introduction to GeoDjango.
Yeah, we'll link back to the talk, and I guess we should link to the docs too.
Yes, absolutely.
Yeah, and it's funny that there's a specific tutorial
because there's not that many tutorials within Django itself.
People just think of the polls tutorial,
but GeoDjango is really its own kind of world within Django.
Yeah, yeah.
I think they use SpatiaLite in the tutorial, and they use a shapefile.
So I think they have you sort of seed your database using the shapefile
and then create a simple model to play around with the data.
Yeah, then you need a way to display it.
Yeah, I'm not remembering what mapping framework they use.
All right, Anna, well, thank you so much for coming on
and telling us about GeoDjango and the work you've been doing
and the discussion around all of that has been super.
Yeah, awesome. Awesome. Thank you for having me.
This has been great.
You know, I wasn't sure how I'd be, um, you know, having taken a little break from GeoDjango, so I'm glad it went okay.
No, it's super.
No, I think that's the best, when you have a little perspective on it, right? Because when you're in the weeds, you don't really think about it in the same way as when you're a little bit far apart from it.
Right, right, totally.
All right, thanks again.
All right, bye-bye.
See you guys. Bye.