WEBVTT

00:00.510 --> 00:00.970
Hello.

00:00.990 --> 00:01.870
Welcome back.

00:02.070 --> 00:05.040
So this section is going to be a bit of a capstone.

00:05.340 --> 00:06.750
Maybe that's an exaggeration.

00:06.750 --> 00:11.720
It's going to be a summary of what we've been focusing on in the last couple of sections.

00:12.000 --> 00:17.490
So there's sort of this subsection of the course the last three or four sections that we're all focused

00:17.490 --> 00:24.780
on working with multiple tables relationships between data tables where we have foreign keys referencing

00:24.840 --> 00:31.500
other tables talking about relationships or associations like a one to many relationship or many to

00:31.500 --> 00:33.000
many relationship and so on.

00:33.090 --> 00:36.790
And then also joints which is what we are focusing on as of late.

00:36.810 --> 00:40.360
So this is going to kind of wrap that all together into something bigger.

00:40.530 --> 00:47.940
Because so far we've really been working with two maybe three tables modeling a single relationship

00:48.090 --> 00:53.220
like customers and orders but now we're going to take something out a little bit bigger.

00:53.430 --> 00:56.390
We're going to focus on a bit of a case study.

00:56.490 --> 01:01.770
We're going to work with a site that you're probably familiar with Instagram and we're going to try

01:01.800 --> 01:08.390
and clone the database for some of the basic functionality so we're not going to be diving into like

01:08.430 --> 01:12.150
advertising and keeping track of our users data and all that.

01:12.180 --> 01:18.450
We're really focusing on the main big entities to a site like Instagram things like photos and comments

01:18.450 --> 01:20.160
and users and so on.

01:20.160 --> 01:25.380
So again we're not really scratching the surface as to how Instagram actually stores everything.

01:25.380 --> 01:31.680
We'll talk about what things work best for scaling or run into some fun issues when we talk about how

01:31.680 --> 01:34.160
hash tags work and how you should store hash tags.

01:34.200 --> 01:39.420
So we will talk a little bit about performance but this is really more about how do you design a schema

01:39.480 --> 01:44.940
for something that has in our example at least seven or eight tables in Instagram.

01:44.940 --> 01:50.140
The real app there's probably dozens of tables but either way it's a big step up from two tables or

01:50.140 --> 01:51.620
three tables Max.

01:51.810 --> 01:56.040
And then the other thing I want to mention is that in this section we're going to focus on designing

01:56.040 --> 01:56.660
the schema.

01:56.850 --> 02:01.010
It's going to be a bit of an exercise but I'm also going to go through and do it myself.

02:01.320 --> 02:07.170
But then in the next section I'm actually going to give you a bunch of data a ton of data.

02:07.170 --> 02:09.410
I spent a long time making this here.

02:09.480 --> 02:14.100
So this is what we'll be working with you actually be inserting this so that for once we'll be able

02:14.100 --> 02:19.380
to work with big amounts of data thousands of rows and you'll see the impact that it has.

02:19.380 --> 02:24.880
And also hopefully it will feel a bit more realistic rather than you know two or three things at a time.

02:24.900 --> 02:25.800
So that's where we're going.

02:25.860 --> 02:28.710
But in this section we got to create the schema first.

02:28.770 --> 02:30.400
We got to understand what we need.

02:30.630 --> 02:35.420
So let's start by taking a look at typical Instagram page.

02:35.580 --> 02:37.570
And yeah some of you may know this.

02:37.650 --> 02:43.020
I'm in my spare time I like to do photography especially landscape photography stuff around San Francisco

02:43.320 --> 02:49.020
so I am using my own Instagram here but that's not really why I'm not trying to get more exposure or

02:49.020 --> 02:49.880
anything.

02:49.980 --> 02:54.370
It's just that I don't want to run into any issues with you know permission or using or having the right

02:54.390 --> 02:56.880
to use somebody's images in my course.

02:57.210 --> 02:57.660
OK.

02:57.780 --> 03:02.870
So with that said the reason this is here like I said is not just to show you my photo but mainly it's

03:03.030 --> 03:06.340
to help us understand and identify the key entities.

03:06.390 --> 03:08.330
So on this basic Instagram.

03:08.430 --> 03:09.860
This is from the web app.

03:10.110 --> 03:12.720
For the most people use the iOS or Android app.

03:12.730 --> 03:15.030
It's the same entity the same data.

03:15.510 --> 03:19.230
And the first thing probably the most obvious is users.

03:19.530 --> 03:24.400
So we're going to need some sort of users table or a way to store users and then moving on.

03:24.600 --> 03:29.120
We've got a really obvious one which are the photos we need a way of storing images or photos.

03:29.310 --> 03:34.380
Now we're not going to like worry about how we actually upload photos and how you know what data type

03:34.380 --> 03:36.380
we use to store them or anything like that.

03:36.450 --> 03:38.600
We'll just say that it's an image you are.

03:38.790 --> 03:42.240
So when we actually get to the exercise portion I'll come back to that.

03:42.330 --> 03:47.740
So we need to store stuff about photos and then we've got things like likes.

03:47.820 --> 03:50.460
So we need to store likes for a given photo.

03:50.610 --> 03:56.880
But also remember there are some complications because you know I can't just click on this heart button

03:57.600 --> 04:00.120
you know 10 times and add 10 likes.

04:00.270 --> 04:01.830
I can only like it once.

04:01.860 --> 04:04.620
So we need to have something there as well.

04:04.800 --> 04:07.200
Maybe there's something we can do with our database.

04:07.310 --> 04:13.290
And and then we've got hash tags and hash tags are fun.

04:13.350 --> 04:14.730
We're going to save that for last.

04:14.730 --> 04:18.430
In this section because there's a lot of ways of doing them.

04:18.540 --> 04:23.160
We'll talk about at least three and there's a couple of advantages and disadvantages to each.

04:23.250 --> 04:29.160
So hashtags though are important part of Instagram and of a lot of tagging in general is a lot of web

04:29.160 --> 04:33.460
apps whether it's blogging or Twitter or something like Instagram.

04:33.690 --> 04:39.580
And then we also have other components that I couldn't fit onto one page like comments here.

04:39.800 --> 04:45.110
That's really the big thing that we also need to be able to store comment information.

04:45.240 --> 04:50.820
And then there's one other big thing which is followers and following.

04:50.820 --> 04:52.990
So the relationships between users.

04:53.040 --> 04:54.860
How do we store that.

04:54.960 --> 04:57.180
You know there's a couple ways of doing that as well.

04:57.450 --> 05:02.410
But on Instagram the way that it works and the way that our database should work is that it's a one

05:02.410 --> 05:03.700
way relationship.

05:03.700 --> 05:08.950
It's not like Facebook where you know I send a request to someone and we only become friends if they

05:08.950 --> 05:11.110
accept or vice versa.

05:11.200 --> 05:15.580
But both of us have to be kind of consenting friends on Instagram.

05:15.580 --> 05:19.570
I can follow somebody and they don't have to phone me back or people can follow me and I don't have

05:19.570 --> 05:20.980
to follow them back.

05:20.980 --> 05:23.250
So how do we store that database.

05:23.580 --> 05:25.360
OK so that's really what we're focusing on.

05:25.360 --> 05:29.900
There are obviously other things like we're not going to store profile information.

05:30.040 --> 05:32.430
We're not going to store stuff about advertising.

05:32.430 --> 05:38.220
Like I said and that's just scratching the surface of what we're not storing there's a bunch more but

05:38.230 --> 05:43.950
we are going to store information and I'm not saying these are the tables we need we may need more tables

05:43.960 --> 05:48.910
we may be able to get away with fewer tables but these are the entities information we need to store

05:48.970 --> 05:59.200
so users photos comments on photos likes for photos as well hashtags which will apply to photos we're

05:59.200 --> 06:00.870
not going to worry about hashtags.

06:01.060 --> 06:06.160
Instead of comments which you can do on the real Instagram we're not going to do that.

06:06.250 --> 06:09.910
We can talk about how but it just makes it a lot in this year.

06:09.910 --> 06:16.870
And then also followers follow these kind of difficult to talk about the different roles there but just

06:16.870 --> 06:20.330
relationships or friendships or whatever you want to call it.

06:20.440 --> 06:23.520
I just called followers and followers or followings.

06:23.530 --> 06:25.580
So these are the main things you need to store.

06:25.780 --> 06:31.600
And the way that I'd like to structure this is I would like you to try and give this a shot so you don't

06:31.600 --> 06:36.510
have to actually you know insert data and do a bunch of exercises on your own.

06:36.520 --> 06:40.200
But I'd like for you to brainstorm how you would do this.

06:40.200 --> 06:41.610
What are the tables you need.

06:41.890 --> 06:47.230
I'm going to you know in the next in the ensuing six or seven videos go through each table we need and

06:47.230 --> 06:52.690
talk about my solution but it would be great if you would take the time to just kind of think about

06:52.690 --> 06:52.830
it.

06:52.840 --> 06:54.010
Jot it down.

06:54.010 --> 06:59.410
Either draw diagrams out create a scheme of file and just write the create table statements whatever

06:59.410 --> 07:03.780
you feel most comfortable with if you want to get a white board or Post-it notes whatever works.

07:03.790 --> 07:07.120
But think about how you would do this if you were creating.

07:07.150 --> 07:11.050
It doesn't matter if you're a developer if you know how to code in other languages or not.

07:11.050 --> 07:16.750
But if you were creating an application like Instagram with the functionality we described what would

07:16.750 --> 07:18.810
you sit down and start with for your database.

07:18.820 --> 07:20.400
And how would you brainstorm.

07:20.650 --> 07:22.450
So that's really what I'd like you to do.

07:22.630 --> 07:27.370
And then in the next couple of videos I'll step through in bite sized chunks.

07:27.370 --> 07:33.760
I want to make sure that these are short videos which I know I have said before and maybe have not held

07:33.760 --> 07:39.700
up my end of the bargain all the time but these will be shorter and they'll go through my thought process

07:39.700 --> 07:40.930
and how I created it.

07:41.000 --> 07:46.420
Oh and before I forget if you're going to attempt this when you're creating the tables that you need

07:46.420 --> 07:47.880
to store the type of information.

07:48.010 --> 07:50.100
Just focus on the core essentials.

07:50.110 --> 07:56.220
So for something like users you don't have to be you know have a username and an email and a password

07:56.230 --> 08:02.470
and I don't know what else extremum asked for like a birthday maybe in a profile and all of that you

08:02.470 --> 08:08.290
don't need that just the basics maybe like username and whatever else you need to have at work or have

08:08.290 --> 08:12.250
it really or associate with all the other entities.

08:12.570 --> 08:18.250
It's easy to you know add the individual things to a table like for users adding an email column.

08:18.250 --> 08:24.100
It's not hard but then figuring out how you have users connected to likes for instance.

08:24.100 --> 08:25.680
That's what I'd like you to focus on.

08:25.930 --> 08:26.170
OK.

08:26.170 --> 08:31.180
So again you don't need all the information you can possibly think of just the key parts.
