WEBVTT

00:00.090 --> 00:00.540
All right.

00:00.540 --> 00:01.690
Next problem here.

00:01.820 --> 00:03.180
This is a little different.

00:03.330 --> 00:09.570
So as I mentioned earlier a real problem on Instagram are bot bot accounts that basically exist to go

00:09.570 --> 00:15.270
around and like a bunch of stuff or to comment on people's things but not create their own content they're

00:15.270 --> 00:18.690
not real people oftentimes are commenting the exact same thing.

00:18.870 --> 00:24.390
Or like if I post a photo I'll notice I get like 10 likes immediately like the next second.

00:24.570 --> 00:29.910
And usually there's not 10 people who are happening to you know to like it immediately there's some

00:29.910 --> 00:35.310
sort of program that's happening behind the scenes where it's looking for new posts and liking them

00:35.310 --> 00:36.440
immediately.

00:36.450 --> 00:39.930
So in our hypothetical Instagram phone we have the same problem.

00:39.960 --> 00:45.180
So what we want to do is just find all the users who we think could be bought their users who have never

00:45.250 --> 00:48.850
or excuse me their users who have liked every single photo on the site.

00:48.900 --> 00:52.050
So if we have 257 photos they liked all of them.

00:52.290 --> 00:57.510
If we have 800 photos if I liked all of them and I'm going to show you a way of doing it that actually

00:57.510 --> 00:58.990
use is something we haven't seen yet.

00:59.010 --> 01:00.570
Something new.

01:00.570 --> 01:05.020
So go ahead give it your own shot and then we'll have a solution.

01:05.310 --> 01:05.600
OK.

01:05.640 --> 01:11.670
So the first thing we want to do we know it's going to be a join tween users and like so we can start

01:11.670 --> 01:20.090
with this to select a star from users and we'll do an INNER JOIN like this.

01:20.100 --> 01:25.440
And the reason we know it's an inner join because we don't care about users who don't have any corresponding

01:25.440 --> 01:29.250
likes or likes or don't have a corresponding user that shouldn't exist.

01:29.250 --> 01:36.900
So we want the INNER JOIN on the overlap only so select start from users or join likes on users start

01:36.920 --> 01:40.350
ID calls like Dot use your ID.

01:40.520 --> 01:48.710
And if we start with that takes a moment you've got remember eight 8000 782 likes.

01:49.360 --> 01:54.580
And now what we want to do is group them based off of the user who did the liking.

01:54.910 --> 01:56.820
So we have a group by.

01:57.410 --> 01:59.810
And we have a bunch of choices of how we group them.

02:00.050 --> 02:08.440
We could do user ID or we can do like user ID will save.

02:09.290 --> 02:17.950
OK so now we have 77 rows here and we can see you know there's a bunch of information we've got the

02:17.990 --> 02:20.340
username we don't need created at.

02:20.370 --> 02:31.540
So let's them this down a bit let's do select user name User ID and we'll just keep it at that.

02:31.950 --> 02:32.340
OK.

02:32.580 --> 02:37.260
So these are all the users who have liked anything at all.

02:37.260 --> 02:38.060
We're grouping them.

02:38.100 --> 02:38.600
Right.

02:38.640 --> 02:42.080
So the groups by user id missing a comma.

02:42.150 --> 02:44.660
That's what's going on.

02:45.210 --> 02:47.990
Now what we want to do is figure out how many.

02:48.060 --> 02:53.430
So that's just a matter of using Count star which you should feel pretty good about now we'll give it

02:53.430 --> 02:54.950
an alias right now.

02:57.520 --> 03:02.620
OK and we can get rid of user ID as well and just to use your name.

03:03.580 --> 03:09.980
OK so we can see that you know we have Andre Purdey 85 has 94 likes.

03:10.300 --> 03:13.400
Let's give this actually numb like this.

03:14.130 --> 03:18.960
But the question was not just to figure out who has the most but to figure out the people who have liked

03:19.320 --> 03:20.910
every single photo.

03:21.360 --> 03:28.410
And you know if you don't remember how many we have 257 we're not talking about cheating.

03:28.530 --> 03:30.600
We could cheat if we wanted to check.

03:30.600 --> 03:32.840
Exactly for 257.

03:33.180 --> 03:37.660
It's not too bad but you might think you could just do like a Where where.

03:37.710 --> 03:40.790
Numb like equals to five 7.

03:41.220 --> 03:43.300
But that won't actually work.

03:43.740 --> 03:50.910
And that's because where clauses actually go before the group by telling it what the data you like to

03:50.920 --> 03:56.290
select to group versus what you'd like to select from the group's data.

03:56.520 --> 03:58.500
So where doesn't work.

03:58.560 --> 04:04.560
We need to use something different which we haven't seen which is called having and having X like where

04:04.950 --> 04:09.800
it's just a different word and what it will do is take our group's data.

04:09.810 --> 04:15.440
Our end result and allow us to filter based off of a clause so we can say having.

04:15.450 --> 04:18.990
And we want to say total equals 257.

04:19.350 --> 04:21.970
Just like that.

04:21.980 --> 04:26.880
So now if we do it up total I change the name to numb like.

04:26.960 --> 04:30.490
But now if I do it can see OK these are the people who have liked it.

04:30.530 --> 04:33.730
257 times like every single post.

04:33.890 --> 04:36.980
But the problem again is hard coded 2:57.

04:37.340 --> 04:39.610
So we could fix that very easily.

04:39.620 --> 04:41.120
We want this to be dynamic.

04:41.180 --> 04:50.820
However many photos we have and we can just use a subquery to select count star from photos.

04:50.840 --> 04:58.670
So this will be where the number of things it likes is equal to the total number of our photos and every

04:58.670 --> 05:00.100
go we get the same exact thing.

05:00.110 --> 05:07.070
But this time if I add a thousand photos or remove 10 photos this will still always be the total number

05:07.070 --> 05:07.950
of photos.

05:08.320 --> 05:08.810
OK.

05:09.020 --> 05:13.160
Kind of an ugly query with a subquery here but if you came up with something different.

05:13.160 --> 05:13.810
Post it.

05:13.880 --> 05:14.560
We'll have a discussion.
