Transcript: Official Django MongoDB Backend - Jib Adegunloye
This episode is brought to you by Buttondown, the easiest way to start, send, and grow your email newsletter.
Hi, welcome to another episode of Django Chat, a podcast on the Django web framework.
I'm Carlton Gibson, joined as ever by Will Vincent. Hello, Will.
Hi, Carlton.
Hello, Will. Today we've got with us Jib Adegunloye, who's a Senior Software Engineer at MongoDB.
Hi, Jib, how are you?
Hey, hey, how are you doing? I'm doing great.
I'm marvellous. I'm marvellous. I'm really excited you've come on. You've got some really exciting news because Mongo just released the preview of a new database backend for MongoDB for Django, right?
Yes, yes. We just did our public preview beta of our release called the Django MongoDB backend. And it quite literally is what it sounds like. A MongoDB backend for Django.
Okay, well, that sounds super. I'm going to pause us there on talking about that, because we're going to nerd out on that a lot. Before we get there, I want to know a little bit about you, who you are, and how come you get to be building this thing. So do tell us.
Yeah. So I guess I'd start at the beginning of my software engineering journey. I came into CS as an undergrad. Back in 2014, I'd always known I'd like CS, and I was actually really interested in database systems.
So a large bulk of my CS education was actually based in SQL databases.
So, right.
Yeah, of course.
And so by the time I graduated, though, the first thing I did, as I think a lot of computer scientists did at the time, was try to build something of my own. And from trying to build something of my own, literally in the bikeshedding conversation with my ragtag team of friends, they said, hey, can we use this thing called MongoDB? And I was like, what is it? And they said, NoSQL. And I was like, absolutely not, I've literally spent... Yeah. But eventually I actually got battered down, and I went ahead and used MongoDB, and I will say I am now a MongoDB evangelist. So once I graduated, I started seeing more and more of it. And when I got to my first job over at Meta, I started seeing more NoSQL and more pieces like that. And after running the gamut around the tech scene, I'd seen an opening for the very thing that had been the defining factor of me leaving academia and going into the business side of things. There was a position open to be a senior Python dev over at MongoDB, and I was like, yeah, absolutely, you will see me there.
So are you the lucky man doing his dream job? Is that what you're saying?
Basically. I'd hate to romanticize things, but it feels that way for sure.
Can I ask, what were you doing at Facebook, Meta? Were you working on databases as well, or...?
No, no. So at Meta I did a lot of things. I first started off at what was known at the time as Oculus Research. It's now Facebook AR/VR, or I think Reality Labs. And so I did a lot of the stuff around what's currently seen as the headsets. And then from there I transferred over to what they termed the blue app, Facebook proper, where I did quite a bit of machine learning ranking work for the Facebook Groups activity feed. And then from there I decided to do yet another pivot, and I worked deeply in the traffic organization, specifically dealing with live stream video network protocols across the Meta family suite of apps: Instagram, Facebook, Oculus even. So if it had live, we were very, very much in tune. And like I said, I started out in a life of databases. I found myself in testing infrastructure, then machine learning, and then network protocols. And I was like, let's go back home. Let's go back to it.
Yeah, enough of that already.
Well, I have to ask just a little bit about machine learning, because it's now public that DjangoCon Europe is happening in April in Dublin, and Carlton's giving a talk. I'm giving a keynote on Django and data science, machine learning. So I'm fascinated to hear from people who've actually done it in a professional capacity, because I'm certainly new to it. I know the web stuff a little bit, but it definitely feels like two separate realms, like people who do data science don't really do web and vice versa. Does that match at all with your experience?
Yeah. I'll say that one thing I was introduced to, by jumping into this machine learning space, was the true vastness of computer science. When I had done the transition to the team, I had almost openly said I have no experience here. And I thought, hey, maybe I would have experience, given I understand scaling large-scale databases and things like that. There are definitely some notes that you can pick up in terms of how you ingest all this data to feed it into a system that's getting trained, or that is using the information as training data in order to generate informed decisions. So there's definitely an aspect of, if you know a lot of really complex SQL, yeah, that's a great place to get started. Right. But I'd say the big thing that I've learned when dealing with applied machine learning was understanding the components you learn about in theory.
Right, the things about hypotheses and building nodes and things such as that, they all kind of map to a larger thing, which is: what is the genuine machine learning pipeline? What is the machine learning componentry that stacks up in this pipeline to actually give me the result? Because even, I'm not sure if you're familiar with the current generative AI things, similar principles kind of map there. Machine learning, to be specific, because generative AI and machine learning have kind of hit a fork in the road now. But yeah, the machine learning aspect is: well, what part do I need to apply machine learning functionality to, to then pass it back off to what I'd like to call the bread and butter, the meat and potatoes engineering aspects of still parsing, processing, and generating output for the information?
And so what that looks like sometimes is, hey, I've got 1.7 billion users, right?
And I want to suggest them a group.
What can I look at?
Well, I know that maybe they've got some terms that they use that are public access content. And I can do basically an unsupervised learning where I just kind of match up commonly placed together keywords. The output of that is tuples of just two to three keyword pairs or word pairings. And you actually can't make any sense of that, but you now know that people talk about the words dog and chicken a lot in a post, right?
Like that's the thing that the clustering algorithm told you.
It's like dogs and chicken, they're doing something there, right?
So now we have that raw new piece of information.
And we're like, okay, so we know that we want to look for posts that say dogs and chicken.
So we then run it through another algorithm.
We then actually run it through some SQL where we're querying for posts or information that are specifically using both the terms dog and chicken simultaneously.
And then we feed it through yet another machine learning algorithm to understand, well, okay, of these phrases that we've now isolated, how can we use it to understand or like piece out sentences that actually make sense?
And then finally, we piece out the sentences that make sense. And then we feed it back into our usual ranking algorithm of what's getting the most engagement. We score that against engagement. And then finally, boom, you've got the output you expected. Two out of four of those stages were machine learning. The other two stages were just, again, us kind of making sense of the machine learning application.
And so it's that fine tuning, where you're building a machine learning algorithm, applying it, and building a sense of confidence around that model, that becomes the crux, at least of my application of it.
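Editor's sketch: the staged flow described above can be illustrated with a few lines of standard-library Python. This is only a toy under stated assumptions, not how the ranking was built at Meta. Plain co-occurrence counting stands in for the unsupervised step, a list filter stands in for the SQL query, the second machine learning stage is left out, and the sample posts, engagement scores, and function names are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: (post text, engagement score).
POSTS = [
    ("my dog chased a chicken today", 40),
    ("chicken recipe for dinner", 15),
    ("the dog and the chicken are friends", 90),
    ("new bike day", 70),
]

STOPWORDS = {"a", "my", "the", "and", "are", "for", "today", "new"}


def keywords(text):
    """Return the set of non-stopword tokens in a post."""
    return {word for word in text.lower().split() if word not in STOPWORDS}


def top_pair(posts):
    """Stand-in for the unsupervised stage: most frequent co-occurring pair."""
    counts = Counter()
    for text, _ in posts:
        counts.update(combinations(sorted(keywords(text)), 2))
    return counts.most_common(1)[0][0]


def posts_with_terms(posts, terms):
    """Stand-in for the SQL stage: keep posts that contain every term."""
    return [post for post in posts if set(terms) <= keywords(post[0])]


def rank_by_engagement(posts):
    """Final stage: order the candidates by engagement, highest first."""
    return sorted(posts, key=lambda post: post[1], reverse=True)


pair = top_pair(POSTS)
ranked = rank_by_engagement(posts_with_terms(POSTS, pair))
print(pair, [text for text, _ in ranked])
```

The pair on its own says nothing about meaning, which is the point made in the conversation: the later, non-ML stages are what turn it into something usable.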
Well, we often make the point that people talk about Internet scale with some things. And if you're working at Facebook with billions of users, that counts. That's one of the few times it counts.
Yeah.
So, wow. Well, I don't want to get too off track, but that's all super interesting, and we could talk about that the whole time, but we're going to talk about other things. So Carlton, go ahead.
Well, okay, go on then, we'll move on to your Meta... not Meta, your MongoDB. Tell us about what you do at MongoDB. You're a senior software engineer there.
Yeah. So like I said, I'm a senior software engineer over at MongoDB. And if I were to describe my role right now, it's, on paper, clear cut.
I work on the database experiences Python team, also known as the drivers Python team or Python drivers team.
And what a driver is, is essentially the library or set of mechanisms that allows a user, a developer, namely, to easily connect to their MongoDB database instance.
So, in this case, our mainstay, our core library, is PyMongo, which is the Python driver for MongoDB, right?
Or the MongoDB driver in Python.
Yeah.
And as a Python developer, that's what I install.
I pip install PyMongo when I want to get going.
Exactly.
And so the mainstay thing we focus on and maintain is that library.
There are several subsidiary libraries that we either contribute to heavily or maintain.
Some of those being MongoEngine, which we contribute to, or an example of one we maintain ourselves, PyMongoArrow.
And then additionally, there are several others, especially in the AI/ML space, like LangChain.
We have several contributions there and, you know, have built a relationship and rapport.
And then we've also got the web frameworks that we've done our best to integrate with, namely things like FastAPI, Flask, and now Django.
So we try our best as a team to integrate there.
Okay, so that's super.
I've got to ask, because hang on,
querying Mongo is nothing at all like using the Django ORM.
So hang on, what's going on here?
How on earth are you integrating with the Django ORM?
Great, great question.
So one thing I've been saying,
and I'm actually eager to get checked on it,
is that when I look at Django, when I've viewed Django,
I've also always been like, oh, this is a SQL framework.
That's what it is.
But when you start pulling
or piecing apart the layers under the hood,
you realize that the way the system works is,
yes, there's the Django QuerySet API,
like that framework piece, the componentry
that allows you to actually issue queries to the database.
At some point that gets converted to explicit SQL,
like structured query language.
Then that gets fed to the database. What you realize is, okay, so there's a set of API calls that actually generate the SQL. So technically, if I just remove the parts that generate the SQL and replace that with MQL, the Mongo query language, it should work, right? And like I said...
Hang on, hang on, Jib. You just said, technically, if I just replaced... Yeah, yeah, sure, sure, sure, we'll just do the Mongo query language in there, no problem. Go on, carry on.
So yeah, of course, I'm definitely being more laissez-faire about my explanation. But to be more specific, let's say you're doing a standard lookup query, like say I want something that starts with, I don't know, the letter Z, right? Forgive my SQL here, but essentially you'd be doing SELECT * FROM, you know, select whatever from the specified database table where it starts with the letter Z. That's what it would be structured like in SQL, right? So for us, when you do that lookup call, instead of calling this conversion, as_sql, we have embedded a callback that mirrors that same behavior, where instead of calling as_sql, we're calling as_mql. And this as_mql is essentially taking that step where it's seeing the startswith and making our MongoDB analog. We actually do have a predicate that lets you regex match against the beginning of a string. And so that's the one mini component. But the thing that wraps it is we've created our own customized compiler, because what Django is actually speaking to is a SQL compiler. And so we've made our customized SQL compiler with an embedded Mongo query generator, so that at the same steps where you would call that SQL where clause to generate your SQL, we call that where, and it starts generating what is the syntactic equivalent of a where in MQL. And then the stuff that's wrapping it, similar to how it's wrapped in Django, where I was talking about SELECT * FROM, that's kind of standard information that you get from having the model exist. You already know what you're selecting from: that's the model you defined. And you already know what your columns are, because you defined them in the fields. So if you know you want X fields, we already know that by virtue of having the model.
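Editor's sketch: the translation step described above can be pictured as a lookup table that maps a Django-style lookup name to a MongoDB-style condition, with the field list coming from the model. This is an illustration under assumptions, not the backend's real code. The `LOOKUPS` table, `lookup_to_mql`, and `build_pipeline` are hypothetical names, and the MQL the real backend generates may look different. `$match`, `$project`, `$regex`, and `$gt` are real MongoDB operators.

```python
import re

# Hypothetical lookup table: Django-style lookup name -> function that
# builds a MongoDB-style condition for a value.
LOOKUPS = {
    "exact": lambda value: value,
    "gt": lambda value: {"$gt": value},
    # Anchor the pattern so it only matches at the start of the string.
    "startswith": lambda value: {"$regex": "^" + re.escape(value)},
}


def lookup_to_mql(field, lookup, value):
    """Play the role of as_mql for a single field lookup."""
    return {field: LOOKUPS[lookup](value)}


def build_pipeline(model_fields, field, lookup, value):
    """Wrap the condition in $match, then keep only the model's fields."""
    return [
        {"$match": lookup_to_mql(field, lookup, value)},
        {"$project": {name: 1 for name in model_fields}},
    ]


# Roughly what Movie.objects.filter(title__startswith="Z") would need,
# for a hypothetical Movie model with title and year fields.
pipeline = build_pipeline(["title", "year"], "title", "startswith", "Z")
print(pipeline)
```

The collection name and the field list never appear in the query the developer writes. Both come from the model definition, which is why the select part needs no extra input.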
So it's very easy for us to now construct our MQL equivalent and say, all right, we don't have a SELECT * FROM, but we do our match statement on startswith, and then we do what's known as a project statement to say, actually, just give me these specific fields that I specify. And so the $project in MQL works in a very similar way to select, you know, to SELECT * FROM.
Right. Or you've got, the ORM has the only() method, where you can select specific fields, I guess.
Yes, we've implemented that as well. Yes. And so what this leads to is an almost completely obfuscated layer of MQL that a Django developer does not need to know in order to get work done, right? And we've understood that this has been a plight for NoSQL developers for some time.
Well, I mean, there have been attempts at this, right? Based around the Google App Engine Datastore originally. I can't remember what it was called, but it kind of worked, but only kind of. Whereas you've gone a step further, right? You've got pretty much all of it working now, right?
Yeah, yeah. And it's great you brought up the Google App Engine, because I would say that was a product of almost 10 years ago now.
Yeah, yeah.
Right, a lot's happened in 10 years. At that time, MongoDB was, I believe, operating on a version less than server version 4. We are currently trailblazing on MongoDB server version 8. We have a wealth more querying predicates that do mirror a lot of the things and expectations over in SQL land, but we too have built a unique set of things that have made us a significantly more powerful database than what the implementation strategies of 10 years ago had to work with. And then moreover, those implementation strategies were using basically a different paradigm. Right now, we're leveraging the unique power of the MongoDB aggregation pipeline.
I'm not sure if folks are familiar, but just to walk through the MongoDB aggregation pipeline: it's not necessarily new, but it's MongoDB's premier service for creating very intuitive and complex queries that can really stand production-level work.
And so how it works is, you've heard me say the term predicate, right?
So there are essentially these aggregation operators and this aggregation pipeline.
And I want you to imagine an assembly line of instructions, like a traditional assembly line.
Right. And so the MongoDB aggregation pipeline works kind of like a traditional assembly line, where each predicate is its own isolated augmentation onto the collection or the database that you're working against.
And so you can quite literally intuit what's going to happen next, because it's procedural, right? So, like using that startswith example, let's say I want to find something super complex. Like I want to find every movie that made over $2 billion in the last 12 years, right?
You could spend a bunch of time trying to think about, how do I do this in an increasingly nested structure?
Or you could say, first, let me match on every movie.
Right.
I want to group every movie by all of its total sales.
Great.
That's one step.
Now that I've grouped every movie by its total sales, I want to now do a greater-than operation on two billion.
Like based on this, are they greater than two billion?
Great.
Now I've found all the ones greater than 2 billion.
And now let's say I want to do something like rank them.
Third procedural step.
All of these isolated buckets that can be swapped around,
interchanged, and tested in real time.
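Editor's sketch: the movie walkthrough above maps onto a list of stages, one dictionary per step. The stage names (`$match`, `$group`, `$sort`) and operators (`$gte`, `$gt`, `$sum`) are real MongoDB ones, but everything else is an assumption made for illustration: the sales records are made up, the 2013 cutoff stands in for "the last 12 years", and the tiny in-memory runner below is a toy that supports only what this example needs, not MongoDB's implementation.

```python
from collections import defaultdict

# Hypothetical ticket-sales records; titles and figures are made up.
SALES = [
    {"title": "Movie A", "year": 2019, "gross": 1_500_000_000},
    {"title": "Movie A", "year": 2020, "gross": 900_000_000},
    {"title": "Movie B", "year": 2021, "gross": 2_100_000_000},
    {"title": "Movie C", "year": 2005, "gross": 3_000_000_000},
    {"title": "Movie D", "year": 2022, "gross": 400_000_000},
]

# Each stage is one isolated step on the assembly line.
PIPELINE = [
    {"$match": {"year": {"$gte": 2013}}},                        # recent sales only
    {"$group": {"_id": "$title", "total": {"$sum": "$gross"}}},  # total per movie
    {"$match": {"total": {"$gt": 2_000_000_000}}},               # over two billion
    {"$sort": {"total": -1}},                                    # rank, highest first
]

COMPARISONS = {"$gt": lambda a, b: a > b, "$gte": lambda a, b: a >= b}


def run_match(docs, spec):
    """Keep documents that satisfy every comparison in the spec."""
    def keep(doc):
        return all(
            COMPARISONS[op](doc[field], bound)
            for field, condition in spec.items()
            for op, bound in condition.items()
        )
    return [doc for doc in docs if keep(doc)]


def run_group(docs, spec):
    """Group by one field and sum another (the only accumulator supported)."""
    key_field = spec["_id"].lstrip("$")
    [(out_field, accumulator)] = [(k, v) for k, v in spec.items() if k != "_id"]
    sum_field = accumulator["$sum"].lstrip("$")
    totals = defaultdict(int)
    for doc in docs:
        totals[doc[key_field]] += doc[sum_field]
    return [{"_id": key, out_field: total} for key, total in totals.items()]


def run_sort(docs, spec):
    """Sort by a single field; -1 means descending."""
    [(field, direction)] = spec.items()
    return sorted(docs, key=lambda doc: doc[field], reverse=direction == -1)


STAGES = {"$match": run_match, "$group": run_group, "$sort": run_sort}


def run_pipeline(docs, pipeline):
    """Feed the output of each stage into the next one."""
    for stage in pipeline:
        [(name, spec)] = stage.items()
        docs = STAGES[name](docs, spec)
    return docs


result = run_pipeline(SALES, PIPELINE)
print(result)
```

Because each stage only sees the output of the one before it, a stage can be removed, reordered, or tested alone by running a shorter slice of the list, for example `run_pipeline(SALES, PIPELINE[:2])`.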
And that kind of fits with how, you know, when you're using a Django QuerySet, you add a filter call, and under the hood it's doing an add_q, where it adds a Q object.
That's exactly the same kind of predicate.
And so that's why I've been championing the point that Django isn't actually a SQL framework.
Oh, good. I love that. That sounds like a conference talk for DjangoCon US. It's in September in Chicago.
Yeah. And the team will be at DjangoCon US, and we will actually also be giving a talk at DjangoCon EU. Which actually jumps into my third point about why this is different. I hate to put it out there so plainly, but in the past solutions, MongoDB may have been a passive member, or we may have helped influence the solution. But in today's iteration, in today's version, MongoDB has put its full force behind it. We've got dedicated engineers working on it as well. We've enlisted the consulting power of Tim Graham, a former Django Fellow. So all of our decisions are couched in both Django specialization and MongoDB specialization. So there will be times where we truly, as a team, come together, butt heads, and have a real discussion about, okay, in this part of the implementation, are we more Django here, or are we more MongoDB, or can we move up, right? And because we're now taking it from two deeply specialized angles, we're able to say more concretely whether or not this is a decision that will stand the test of time, because we've got somebody here, two people in fact, that can speak very acutely to what a typical Django developer looks for, and then we've got employees who deal with MongoDB every day. And so I think even in that relationship, we've formed something richer. And we've folded it into our quarterly plans, our yearly plans. It is something that we're committed to continually iterating on.
Well, that's super important. I think the trouble with third-party backends historically has been, well, who's maintaining them? Who's developing them? Where's the engineering time to make sure they keep up with the changes in the ORM as Django evolves? And for Mongo to allocate that engineering time to it, that gives us some degree of confidence that it will continue to evolve, compared to, say, other backends from companies that we won't bother naming.
Can I ask about the... so now there's support for the three big web frameworks, right? So there's already support for Flask and FastAPI and now Django. That's correct, right?
Yes.
How much overlap is there? Because you're going, like, there's the Python space and then the web
space. Can you reuse the Python stuff at all? Or are they completely different, the drivers
for each of the three? So they all use the same PyMongo driver.
They all leverage that same core driver. So in terms of that, there's definitely overlap, because we can use it. But in terms of how we view or we've treated each one, I would say that, from my experience, Flask and FastAPI are closer in wheelhouse, based on their implementation. Flask being the least opinionated, as we know, like, hey, you just want to start something up, here you go, right? FastAPI having a bit of an opinion, specifically Sebastián's template generator is a little more opinionated because it helps you scaffold things. But in terms of letting you pick and choose your own adventure, FastAPI still allows that leverage. Django is very, very much on the other end.
Well, the other two don't have the ORM, right? They don't have that.
Yeah, I mean, but they have SQL, yeah.
I guess that'd be a question, like, you know, SQLAlchemy. I mean, okay, they don't have their own thing, but Flask certainly, you use SQLAlchemy in most of the cases I'm familiar with. Or disagree with me, please, if that's not the case.
Oh, no, no, you're right. I'd say that the thing is, the usage of the ORMs is not as tightly coupled, right? And because they're not as tightly coupled, for us our decision is, I think it's just better to make it easy to use or leverage our driver directly. Versus in Django, that's kind of antithetical to the experience, right?
Yeah, no, historically that's it. You've got your Django view, and then all of a sudden you're bringing in the library for whichever NoSQL database you're using, and it's like, well, hang on, this just doesn't feel like Django. And the whole point with integrating the ORM is it will just feel like Django still.
And I'll say that that's where we understood the thesis was different, right? When you go to Django, you're not just there for something that allows you to spin up a website. You're there because I want to have a solid authentication system. I want solid administrative management. I want solid session management. I want to be able to know that when I build a form, it's all automatically able to take the information and then write it to the database in such a way that I'm not thinking about this 24/7. In the prior solutions, there's still going to be a somewhat non-trivial amount of work, and that's not to their detriment, it's just that you're building it yourself, right?
So, right.
It's just that you're building it yourself. And we understood that in Django, if we're making users build it themselves, then we're kind of going against what Django says on its website.
So I have to ask, are there corners yet that you aren't happy with, that aren't finished, that you...?
Absolutely. If there's one thing I stand by, it's this: one, we're in beta, but two, even after we go for our general availability release later this year, there's still going to be more work to be done, because at the end of the day, fundamentally, SQL and NoSQL... there's a reason it's called NoSQL. But to answer the question in the present, there are a lot of MongoDB-specific things that we are aching to get in. We're just trying to get them in the right way. So one example is, in a traditional SQL structure, you're looking at the usage of foreign keys. That's second nature, right? For us, we also support that ability. If you want to use your foreign keys, go ahead. But we understand that what makes MongoDB performant is moving away from that paradigm and using what we call embedded documents.
In this library, they're known as embedded models
and they're viewed as embedded model fields.
The power of MongoDB is that you can have a document
with tons, like hundreds, thousands of sub-documents in them
and then query against that.
This is not something that's really been done in SQL.
I know there's HStore, there are sub-objects and things like that, but the degree to which they're done in MongoDB is a little bit different.
And so we've been working on ways to get this nested embedded model structure to be airtight, but also work very intuitively in the Django ecosystem.
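Editor's sketch: the difference between a foreign key and an embedded document is easiest to see in the shape of the data. Below, the reviews and the studio live inside the movie document itself, and a dotted path reaches into them, the way MongoDB's dot notation does for equality matches (a path through an array matches if any element matches). The document, the `resolve` helper, and the `matches` helper are hypothetical illustrations in plain Python. They are not the backend's embedded model API, which is not shown in this conversation.

```python
# Hypothetical document shape: the studio and the reviews are embedded in
# the movie document instead of living in separate tables behind foreign keys.
MOVIE = {
    "title": "Movie A",
    "studio": {"name": "Example Studio", "country": "US"},
    "reviews": [
        {"author": "ana", "rating": 5},
        {"author": "ben", "rating": 3},
    ],
}


def resolve(doc, dotted_path):
    """Follow a dotted path; a list along the way fans out over its items."""
    values = [doc]
    for key in dotted_path.split("."):
        next_values = []
        for value in values:
            if isinstance(value, list):
                next_values.extend(item[key] for item in value if key in item)
            elif isinstance(value, dict) and key in value:
                next_values.append(value[key])
        values = next_values
    return values


def matches(doc, query):
    """True if every dotted path in the query reaches the target value."""
    return all(target in resolve(doc, path) for path, target in query.items())


print(resolve(MOVIE, "reviews.rating"))
print(matches(MOVIE, {"studio.country": "US", "reviews.rating": 5}))
```

No join is needed to answer the question, because everything the query touches arrives with the one document.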
And so that's one area that we're actually barreling down right now, and we are absolutely set on making sure that that is a fluid experience for folks. There are other things that we've run into corner case situations on. Usually, if it's just an issue around SQL, then we just ignore it, like, well, I can't support these SQL functions, it's okay. But yeah, I'd say we're running into things where we want to improve more, iterate, and introduce more MongoDB-specific aspects. And as we are personally going through making sure those work, we're also hoping that during this beta phase people tell us what isn't working, or what they find to be weird, so we can immediately capture those.
Yeah.
This episode is brought to you by Buttondown, that's buttondown.com, email software for developers like you. There are hundreds of email marketing software services out there, and they all pretty much offer the same thing: collect and clean addresses, send out broadcasts or drip campaigns, get analytics so you can see what's resonating and what's not. Buttondown is designed to hook into the tools that you already care about, everything from static site generators like Jekyll or Hugo to payment platforms like Stripe and Memberful. You can hook your site up to Buttondown with just a form element or a simple REST call. Write emails in Markdown and then get on with the actual work you're supposed to do. New customers can save 50% off their first year with Buttondown using the coupon code DJANGO. And if you email support, they'll white-glove migrate your existing subscribers and archives for free.
I saw this in the docs. I know async is a new frontier in Django, and I believe there isn't really Mongo support for that yet, is that correct?
Yeah, there isn't Mongo support for that yet, but that is also an area that we want to introduce. In our driver, we are also currently improving our asynchronous functionality support. So, simultaneously with this, you'd see an improved and richer version of asynchronous functionality in the baseline driver, and we hope it complements Django as well. So we aim to have support, but for now we figured it's great to get that bread and butter synchronous ability good to go. And as we push towards general availability, where more production use cases are leveraged, we want to be confident about saying, hey, we can also support your asynchronous case as well.
Okay. What was the hardest thing? What was the moment in the project where you were like, oh, this isn't gonna work?
Oh yeah. So there's actually so many of these. Let me start off with what was easy. What was actually shockingly easy to me was doing that initial replacement of the SQL with MQL. That was actually shockingly easy. It was like, hey, here are the lookups, here are the functions that the lookups will generate instead, right? It's quite literally a dictionary.
Yeah.
Where things got very interesting was when we had to start supporting things like annotations, or supporting our version of grouping, like supporting GROUP BY in a MongoDB-specific way. Because like I said, we have a very procedurally generated SQL statement, I mean MQL. It's an array of Python objects, if you want to think about it like that. And so one big thing is, in order for it to be self-referential about something that happened, let's say at step number four, we need to capture metadata now, and then make sure that the metadata hasn't been mutated too much from step number four to step number seven, to ensure that the output is the same. And then another challenging one was that MongoDB, at the database layer, handles nulls differently, right? So where Django would expect a null value, MongoDB is like, bro, we don't have a value at all, we're not beholden to that. And so we would run into those quirks, where it's like, okay, what do we need to do? How do we get smart about this? And how do we really get things working? And then I think, honestly, personal to me, one of the biggest necessary headaches: tests.
I was gonna ask you about tests.
So the first thing we ran into with the tests was, well, when the Django test suite runs its tests, clearly it's going to check against an AutoField, an integer, as its key. MongoDB does not use that. Traditionally our primary key is an ObjectId, which is a... let me not get into the bits.
But basically, it's a specialized unique ID constructed in BSON.
And whenever you submit, let's say, a document or a row, we don't usually supply that ObjectId at creation, because we'll automatically generate an ObjectId and then submit it.
And so what would happen in the tests is the test case would expect, hey, this object that's coming back should have the ID of one, two, three, or four. And instead we'd have the ObjectId of 1a3735, like, crap. Right.
And so we had to basically fork the tests to then override the natural test structure to allow us to just use ObjectIds, or to kind of fake-use a different ID implementation.
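Editor's sketch: to make the mismatch concrete, here is a standard-library imitation of the ObjectId layout as documented by MongoDB: 12 bytes made of a 4-byte timestamp in seconds, a 5-byte random value generated once per process, and a 3-byte incrementing counter. It is for illustration only. Real code would use `bson.ObjectId` from PyMongo, and the function names below are hypothetical.

```python
import itertools
import os
import struct
import time

_PROCESS_RANDOM = os.urandom(5)  # 5-byte value, generated once per process
_COUNTER = itertools.count(int.from_bytes(os.urandom(3), "big"))  # random start


def new_object_id(now=None):
    """Return a 24-character hex string laid out like an ObjectId."""
    seconds = int(time.time() if now is None else now)
    counter = next(_COUNTER) % 0x1000000  # wrap around to fit in 3 bytes
    raw = struct.pack(">I", seconds) + _PROCESS_RANDOM + counter.to_bytes(3, "big")
    return raw.hex()


def created_at(object_id):
    """Recover the embedded creation time (seconds since the epoch)."""
    return int(object_id[:8], 16)


first = new_object_id(now=1_700_000_000)
second = new_object_id(now=1_700_000_000)
print(first, second)

# A test written for an integer AutoField would assert something like
# obj.pk == 1, which can never hold for a value of this shape.
print(first == 1)
```

This is why a suite that hard-codes primary keys 1, 2, 3 needs its expectations overridden rather than the backend changed: the IDs are unique and roughly time-ordered, but they are not small sequential integers.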
And the thing is, this sounds, again, like a hand-wavy and easy solution. But when there are hundreds, thousands of tests that run into this minute issue, and you're trying to debug whether or not, hey, is it this thing that we've identified, or something completely different, or is it something we can even solve at all? It is something where you have to quite literally look at each test and deduce it yourself.
And I think, for me, it felt like the most daunting thing, because to look at such a tenured framework's test cases and say, okay, we're going to try for this, this, this, this, this, this, this, this, this, right?
And then to kind of present that to the team and then have, you know, Tim, who's extremely well-informed, say, actually, no, not this one, not this one, not this one, not this one.
And it just felt like, is this ever actually going to end?
And thankfully, it did. We managed to triage every single test, figure out the ones that don't necessarily matter for the sake of what we're doing, and the ones that we absolutely want to make sure matter based on this fork.
I think it comes out to the number of, like, 82 test suites with, you know, well into the hundreds in terms of tests.
But once everything went green, that was honestly like Christmas Day for me.
Yeah, I can imagine.
I can imagine.
It objectively works.
And how are you keeping your test suite up to date with changes that come into Django's main branch or development branch?
So right now there are two sort of things.
One, because Django's branching is like 5.1, 5.2, 4.2, we've got our individual branches, forked branches, that keep up with whatever changes may manifest on those specific branches.
And clearly, for main, we also do our best to test against that.
And so in terms of like how we keep that up is, hey, we check with 5.1. Are we in alignment with the 5.1 test? Are we in alignment with 5.1 changes? Great. Are we in alignment with 5.0 changes, 5.0 tests? Great.
With main, we see something's been added. Is this something that we want to get ahead of now? Yes.
And so we even have testing coming in, in later bits, just to continually make sure that we are up to date. And then we, again, triage the issue to understand whether or not it's something we need to tackle now, if it's something we can just skip, and things of that nature.
I think that's the key, is that historically, backends have sort of matched the Django test suite, but maybe at a point of time, but then they just haven't been able to keep the engineering going to keep it up to date. Because Django keeps moving, right?
It does.
All the time. And so it's that bit, it's the kind of like, yeah, we need to keep bringing in the new tests, keep bringing in the new features as they're developed, you know, to the extent that it matches the data model.
Yeah.
I'm just realizing, yeah, well, you can continue. I'm just saying, I just realized this is, I think, now the third episode we've done related to MongoDB, because we've had two developer advocates before. We had Mark Smith, who's based in the UK, two years ago, I think. And then we had Aaron Bassett before that. I don't think he's at MongoDB anymore, but he was on the Django board with me. So this is a rich vein of discussion. But go ahead.
When we had Aaron on, this idea of a native integration with the Django ORM was no more than a twinkle in somebody's eye.
Yeah, I mean, that was almost five years ago. I think there was a third, there was a, it wasn't anything official, I don't think. Okay, sorry, go ahead.
I've got one more question: migrations. How does that look? Because MongoDB's schemaless, right? NoSQL, schemaless. So how do migrations fit in? Do I just create them as usual, run them as usual, and all this magic happens?
Yes. So, one, yes, I'll say this. Next question, Carlton.
Yeah, yeah, obviously. Come on, why are you calling me on this?
But no, it's a great question. So again, MongoDB's schemaless. I will say that having a lack of a schema, one, does not mean that you don't need to have a schema. I say this as a person who's done a lot of database work: please, always, as you're codifying your work, establish a schema, because you don't want developers or anybody kind of saying, oh, there's no schema, I'm just going to throw in a random key every now and again. And then, yeah, that's...
Later down the road, yeah. I've got bad memories from about 2010 of that.
But yeah, the way migrations work right now is you quite literally call it, it'll create your database, it'll create the indexes that you specify.
Right now, we've chosen not to enforce a schema.
Again, we're in beta, and because most people understand the rapid prototyping and development of MongoDB, enforcing the schema at this stage just feels like we may be jumping the gun. Right. But we still recognize the beauty that comes from having migrations.
One, your schema is now codified, regardless of whether or not it is enforced at the database layer.
And then two, when you make alterations, migrations will also capture those alterations and apply them as necessary, even going so far as supplying whatever the necessary default is when it comes down to it.
So in terms of migrations, it works, TM. It just works. It's just not enforcing schema. But that does not remove the level of importance we understood and found about it. And so that's something we could do at a later date, or that one could do at a later date. And even now we do have ways, or as we like to call them, escape hatches, functionalities where, if there is something super unique to MongoDB that at its current iteration we haven't exposed through Django's feature set, we've documented a way to just grab the underlying driver or the underlying client connection that's used. And then you can configure the client information right there directly and then go right back to your Django development, and it'll propagate because you're using the same client configuration.
Yeah, and even in normal Django land you occasionally grab the connection and grab the cursor.
Yeah. And so if somebody does want to enforce schema, we've documented how you can go do that: grab the client, use the $jsonSchema validation, and you're good to go.
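As a rough sketch of the escape hatch mentioned here: MongoDB lets you attach a `$jsonSchema` validator to an existing collection with the `collMod` command, which PyMongo exposes through `Database.command`. The helper names below (`build_validator`, `enforce_schema`) and the example field names are hypothetical. How you obtain the database handle from the Django connection is covered by the backend's own documentation and is not shown here.

```python
from typing import Any, Protocol


class SupportsCommand(Protocol):
    """Anything with a PyMongo-style Database.command method."""

    def command(self, command: str, value: Any = 1, **kwargs: Any) -> Any: ...


def build_validator(required: list, properties: dict) -> dict:
    """Build a $jsonSchema validator document for a collection."""
    return {
        "$jsonSchema": {
            "bsonType": "object",
            "required": list(required),
            "properties": dict(properties),
        }
    }


def enforce_schema(db: SupportsCommand, collection_name: str, validator: dict) -> Any:
    """Attach the validator to an existing collection via the collMod command."""
    return db.command(
        "collMod",
        collection_name,
        validator=validator,
        validationLevel="strict",  # validate all inserts and updates
    )


# Hypothetical example schema: every document needs a string "title".
BOOK_VALIDATOR = build_validator(
    required=["title"],
    properties={"title": {"bsonType": "string"}},
)
```

With a real PyMongo `Database` object in hand, the call would look like `enforce_schema(db, "books", BOOK_VALIDATOR)`, where `"books"` is an assumed collection name.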
Wow.
Yeah.
But, well, you've answered all my questions. I'm just itching to go and play now. Thank you.
Well, I mean, what should we be asking you? I know there's a whole lot of press that's just come out. We've touched upon a number of things. Is there anything we haven't mentioned that specifically you're excited about or think people should know about with this new driver?
Yeah, that in and of itself is a great question. So first, I think one of the biggest things that I'm really proud we put out in this iteration is called raw aggregate. Now, I understand there's literally a predicate for aggregation in SQL. This raw aggregate is harkening towards MongoDB's aggregation pipeline, right? So if you remember the .raw() in Django, which allows you to pass a string that has structured SQL in it, we've made our own analog. But it's not just a string. It's a list of, like I said, Python dictionaries, and in this list of Python dictionaries you can quite literally construct a normal MongoDB aggregation pipeline. And what it'll do, similar to how .raw() works, is give you a Django query set object. Like, you will get a query set object back from a query you've done basically executing raw MQL.
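To make "a list of Python dictionaries" concrete, here is a minimal sketch of an aggregation pipeline built in plain Python. The stage operators (`$match`, `$sort`, `$limit`) are standard MongoDB; the `rating` field, the `Book` model, and the exact manager call in the closing comment are assumptions for illustration, so check the backend's documentation for the real entry point.

```python
def top_rated_pipeline(min_rating: float, limit: int) -> list:
    """Build an aggregation pipeline as a plain list of stage dictionaries."""
    return [
        # Keep only documents at or above the rating threshold.
        {"$match": {"rating": {"$gte": min_rating}}},
        # Highest rated first.
        {"$sort": {"rating": -1}},
        # Cap the number of documents returned.
        {"$limit": limit},
    ]


pipeline = top_rated_pipeline(min_rating=4.0, limit=10)

# Assumed usage, based on the description in this episode (not verified here):
#     books = Book.objects.raw_aggregate(pipeline)
# where Book is a hypothetical model with a "rating" field.
```

The pipeline deliberately keeps each document's shape intact, on the assumption that the results are mapped back onto model instances much as `.raw()` does for SQL rows.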
And I think that's a really powerful thing, because no other implementation has done that. No other implementation has really looked through the API for all of its richness and said, hey, this is something that would be really powerful to use, especially at this stage. And the reason even more so why it's powerful is MongoDB's got things that come natively, right? We've got vector search, we've got our full-text search capabilities, we've got our geospatial querying, and we've got even more interesting search predicates, like graph lookup and things like that.
And by using that raw aggregate framework, you're immediately able to interface with that and then still get Django like you'd expect, and use it very seamlessly in the flow. In the future we definitely want to improve upon this and potentially have some more integrated solution for search and geospatial.
But for now, if you want to do more advanced things that you understand MongoDB can do fairly well, you can.
Yeah, I think that's a nice selling point. One of the complaints about an ORM, a multi-database ORM in particular, is that it ends up being the lowest common denominator. That's the criticism, because you don't get the advanced features of database A, and you don't get the advanced features of database B, and you don't get the advanced features of database C. So to have made space to say, you know what, if you're using Mongo, you can still reach out for those advanced features, that's a nice addition. Of course, you can always get the driver, you can always do the things, but if it's nicely exposed in the API, that makes it all the more enticing.
Right. And then the second thing. I think one big question I remember getting asked at DjangoCon US last year was, hey, are you going to do joins?
And I want to say, very succinctly: yes, we do. We actually have introduced this operator called $lookup. It's been something that's existed in MongoDB for some time, and without getting too in the weeds, that is our join. You can execute a left join, a right join, an outer join through the $lookup operation. Now, my big sort of cautioning point is, like I said earlier, third normal form and leveraging that is not what makes MongoDB MongoDB. So for now, please go ahead.
Use it. Create a foreign key. Create a one-to-one field, a one-to-many field, a many-to-many field.
But really, in the coming months, in the coming weeks, look forward to seeing things like nested embedded models, and we will also provide documentation on how to transform that sort of traditional foreign key usage into a representation in MongoDB that leverages that more traditional document structure that we at MongoDB know and love.
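For readers who have not seen `$lookup`, here is a hedged sketch of what such a stage looks like when written by hand as Python dictionaries. MongoDB's documentation describes `$lookup` as a left outer join; other shapes are usually built by adding further stages, as the `$unwind` option below shows. The collection and field names (`authors`, `author_id`) are hypothetical, and this is not necessarily the pipeline the backend generates for a foreign key.

```python
def lookup_stage(from_collection: str, local_field: str,
                 foreign_field: str, as_field: str) -> dict:
    """Build a $lookup stage that joins documents from another collection."""
    return {
        "$lookup": {
            "from": from_collection,
            "localField": local_field,
            "foreignField": foreign_field,
            "as": as_field,
        }
    }


def books_with_authors_pipeline(keep_unmatched: bool = True) -> list:
    """Join each book to its author.

    keep_unmatched=True keeps books without an author (left outer join);
    keep_unmatched=False drops them, which behaves like an inner join.
    """
    return [
        lookup_stage("authors", "author_id", "_id", "author"),
        # $lookup yields an array; $unwind flattens it to a single subdocument.
        {
            "$unwind": {
                "path": "$author",
                "preserveNullAndEmptyArrays": keep_unmatched,
            }
        },
    ]
```

The design point from the conversation still applies: a hand-built join like this works, but embedding related data in the document is often the more natural MongoDB representation.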
Yeah, right. Ultimately, the SQL model and the document model, the relational and document models, aren't the same, right?
And I'd say the third piece that I want to touch on is the third-party integrations, third-party libraries.
Yes. So like I said, like we've chatted about, we're trying to make this work with Django, period.
And so part of Django's power is all of the libraries that have come out, things like Wagtail, django-allauth, Django REST Framework, django-filter.
We are currently doing the work to make sure those are just as seamless as we're making this ORM, as we're making this framework integration to be.
So that is in our work and on our roadmap for the months to come.
And if there is a third-party library, a third-party framework with Django that, as a developer, you find important, we are all ears.
We are listening, we are open, and we'd love to know sooner rather than later what those are so we can evaluate whether or not we can support them, because we most likely want to.
And so, like, to conclude those three points, it's like, we're still growing.
We're still evolving.
We're still learning.
I say this, yeah, I say this as the sort of lead engineer on this project.
We are here for real.
We are fully invested and with open ears.
And we are a team of enthusiastic Django developers writing code for MongoDB.
So don't ever think you can't just, you know, check out the MongoDB community forum or post something in the Django forums or even post something on Reddit and at MongoDB and let us know.
We've got folks just looking, waiting, checking, because we want to do this right and we want to do this permanently.
That's kind of like the recurring motif.
Yeah, no, I really support that because so many companies, they put a Django library out and they don't really do the job and they don't support it in the medium term.
And people get burnt by that.
So, you know, that you're here saying that Mongo is here for the medium and long term.
That's, you know, that's a really nice thing to hear.
Yeah, I mean, I see you also up on the repo.
You were just yesterday putting in commits.
So I think this is really great, I mean, selfishly for Django, right?
To have a powerful NoSQL story because people have used Mongo over the years.
But there hasn't been that all-in support the way there is now.
And so both, obviously, hopefully Django clearly makes sense for Mongo.
But, I mean, just for Django to be broader in terms of Python web, all-encompassing, yeah, it opens up so many areas. You know, one of the big reasons why people use Flask is because they want NoSQL and they feel like they can't do it in Django.
Like, I would say that's a top three, top five reason, I hear.
So that would be pretty powerful for Django if that shifts a bit.
And I can even add an anecdote myself.
So I've been doing a project, like I said, the project that I started off on MongoDB some time back with my friends.
And we were like, OK, we've been using Flask.
This is not like a production level site.
Let's actually, you know, get some authentication and paneling that we can stand behind.
First thing we look towards is like, OK, what does it take to move to Django?
And we ultimately had to pass, because the upkeep for maintaining both MongoDB and Django at that time was not something that we, as a sort of ragtag group of developers, felt like handling.
And so this really feels like a full circle moment for me, as well as those same friends who I am currently messaging like, guys, look what I've done.
It's happening. It's happening.
And so it's really a moment where we're rejoicing, like, wow, I wish we had this, you know, eight years ago.
Yeah, no, I mean, again, thinking about what I just said, I think it's definitely a top three reason why people choose Flask. So why would someone choose Flask? They would say it's more lightweight and you can get started on it faster. There's been a bunch of work, not least Carlton's given talks, there's these nanodjango repos and stuff showing that, hey, you don't have to use all the batteries of Django right off the bat. NoSQL support is a huge one. I can't think what the third is. But, you know, to leverage what we have in this community, the third-party packages, the forum, all these things. I can't speak as much to FastAPI, but with Flask, the whole point is it's a tool, in part, that you use with other things. Whereas Django is an all-in-one code and community in a way that Flask is not. And that's not to take anything away from Flask. They're just different things, different tools for different jobs.
And yeah, I definitely resonate with that. With Django, when you're implementing something for Django, you're implementing something for, clearly, a community. It was my first time at a DjangoCon, and I was, you know, grinning ear to ear, just chatting with everyone.
Yeah.
And I really look forward to going to another and sort of being engaged, because I understand it's not just code that's being written. It's the impact on the communities, the workflows, and how it's influencing so many things, and how even engaging with the community can help inform decisions itself. So I will say it's a full package deal, and that's what I do really love about Django.
Well, I look forward to seeing your colleagues at DjangoCon Europe. And I just want to mention, you said something that I've done a couple of times and have been corrected on by Europeans, which is I've referred to it as DjangoCon EU. There is a difference, apparently, between the EU and Europe, especially when we were in Scotland two years ago. So I'm attuned to that because I was corrected on it several times. So I'll just mention that to you: it's DjangoCon Europe, not EU.
Thank you so much. Thank you so much.
These things are important. As a fellow American, you know, it's like, whatever, over there, it's less important. But, you know.
No, thank you. That is a very good clarification.
Great. So, Jib, I've got one question we often ask people, but I think it'd be really nice to ask you this question, because you've come to Django as somebody implementing a NoSQL backend for the ORM. If you had a magic wand and you could fix one thing in Django, what would it be? If Django could make one change, what would your magic wand fix, or new feature, or change?
Yeah. Let's see. Magic wand feature fixing for Django. I would say that I would really like, um...
It's a tough one. Sorry if we sprung it on you. I mean, it's something we've gotten in the habit recently of asking guests. Yeah, it's a tough one.
And it may be, now you've worked around all the bad things, you don't want to change anything, because you've already worked around them.
Right, right.
I mean, for example, just to get your brain going, I think Carlton and I have talked about how one thing would be to have, when you start a new Django project, some way that you can have, like, in brackets, local or production, to kind of get over that cliff. Because it defaults to local, and then to get to production is just a lot of steps for anyone. And I think that's off-putting to people. So that seems like something that could be done.
Yeah, actually, that's a great point. I'd say, in that same vein, when we were creating our quick start for the Django project, I was like, how quick can I make this quick start? And I feel like there were several things where, if there was a way to basically go from project to first app in one command-line argument, that would be beautiful. Because right now, you startproject, after you startproject you startapp, and then you go in, you edit the app, and then you also have to edit the main project to link the URLs. So there we go, I found my answer.
Yeah, URL linking. No, good, good. So there is a DEP in progress, a Django Enhancement Proposal in process, to simplify the default project and the tutorial for that, so as to have a kind of single-file thing. So you startproject, and then that's already an app that you would then carry on with and create your views and things.
I mean, and to give a shout-out, Eric Matthes, who's the author of Python Crash Course, has a package he's been working on for a while, a repo called django-simple-deploy, that integrates a number of backends, I think Heroku, Fly, Platform.sh, to try to have one place to handle all this. He's been working on that a lot over the last couple of years. There's also, I maintain a package called Lithium now, that was DjangoX. It doesn't have full deployment yet, but it just gets you started. It gives you Django, gives you crispy forms, gives you a couple of things so you can just get going. I would check that out.
It's basically a simpler Cookiecutter Django, whatever that is called. So that's been around for a while. But eventually, if there's a DEP and there's an officially supported way to do it, I think it would get over that idea in people's heads that Django is like this beast that can't be tamed, when, you know, it's customizable, but we can have an easier path if you want to just get going with something. Yes, Carlton?
I think as well, part of the DEP is the idea to kind of promote project templates, to make them a bit more of a thing. So that, you know, if you're working for MongoDB, you can create a MongoDB starter project, and people just go startproject with the link, and then it's kind of already got the right bits set up. I think that's possible now, but it's kind of hidden and it's a bit arcane, whereas if we can promote that a bit more, maybe it becomes normalized that each shop would have their own starter project.
Yeah, yeah, yeah.
And I mean, just as a last thought, it's great to see a database backend getting involved with Django. I mean, if you look at who's sponsored things historically, like conferences and other things, there's been platform providers, so Platform.sh, some other ones over the years, and then consultancies. There isn't really a big next place. So Mongo seems very well placed just for that, to slip in.
And also because, yeah, if we can just do away with the "NoSQL doesn't work on Django" story, that's such a huge win. And to have, you know, headline, another backend for the Django ORM, that's major.
Can we finally kick out Oracle? Like, let's just swap it.
Oh yeah, okay, you can put that to the board.
I know. Now that I'm not on the board, I can just, I was actually asking some other stuff, I can just be like, wouldn't it be nice if someone XYZ. All right, so we're going to link to the post, to the repo. If people do have suggestions around third-party packages and other things, what's the easiest way to reach out? You mentioned you're available on the forum, on Reddit. Should they do an issue on the project, or what would be the preferred way?
So I'd say the preferred way right now is, if you check the repo, we've got a section called Issues and Help. The fastest response we'll give is if you provide feedback on the MongoDB community forums or file an issue or Jira ticket, because that'll get right to developers almost immediately. Also, definitely, if the Django forum is the next sort of comfortable one, I'd say go. But yeah, the MongoDB community forum and our Issues and Help section, providing a Jira ticket.
Okay, well, thank you. This was a pretty amazing chat, I think, and timely. I'm sorry that it's going to wait two weeks to come out, but that's okay. You know, the other things will come out in the news, the Django News newsletter and other things, and then people will have this to look forward to, to have more explanation. And thank you.
Thank you for having me. I feel so honored and humbled to be here.
It's awesome having you on. A really, really good chat.
So to wrap it up, djangochat.com, links to everything in the show notes, and we'll see everyone next time. Bye-bye.
Bye.
This episode was brought to you by ButtonDown, the easiest way to start, send, and grow your email newsletter.