WEBVTT

1
00:00:00.000 --> 00:00:01.260
In this lesson,

2
00:00:01.260 --> 00:00:04.710
we will learn about scaling considerations.

3
00:00:04.710 --> 00:00:08.400
Scalability is measured by the number of requests

4
00:00:08.400 --> 00:00:12.690
an asset or server can effectively support simultaneously.

5
00:00:12.690 --> 00:00:14.460
When the system or application

6
00:00:14.460 --> 00:00:17.130
can no longer handle additional requests,

7
00:00:17.130 --> 00:00:21.000
you have found the limits of that system's scalability.

8
00:00:21.000 --> 00:00:23.430
In traditional networks and servers,

9
00:00:23.430 --> 00:00:27.270
this occurs due to the limitation of processing capacity

10
00:00:27.270 --> 00:00:30.420
or the amount of physical memory the machine has,

11
00:00:30.420 --> 00:00:34.230
or even the amount of network bandwidth that is available.

12
00:00:34.230 --> 00:00:37.170
Now, if you begin to run out of resources

13
00:00:37.170 --> 00:00:39.960
when trying to support requests from users,

14
00:00:39.960 --> 00:00:41.610
then you need to scale up

15
00:00:41.610 --> 00:00:45.090
or scale out to increase those resources

16
00:00:45.090 --> 00:00:48.390
and be able to support the additional requests.

17
00:00:48.390 --> 00:00:51.690
Scaling up is known as vertical scaling.

18
00:00:51.690 --> 00:00:54.870
Scaling out is known as horizontal scaling.

19
00:00:54.870 --> 00:00:57.480
Let's take a look at each of these.

20
00:00:57.480 --> 00:01:00.803
Vertical scaling refers to adding more resources,

21
00:01:00.803 --> 00:01:05.460
such as more processing power, more memory, more storage,

22
00:01:05.460 --> 00:01:08.400
or more bandwidth to your existing machine.

23
00:01:08.400 --> 00:01:12.570
For example, in my laptop, I have 16 gigabytes of RAM.

24
00:01:12.570 --> 00:01:15.780
Now, if I find the system starts slowing down

25
00:01:15.780 --> 00:01:18.000
when I open really large files,

26
00:01:18.000 --> 00:01:20.970
I may need to increase my available memory,

27
00:01:20.970 --> 00:01:23.850
and I could do this by adding additional RAM

28
00:01:23.850 --> 00:01:27.720
to bring my total amount up to 32 gigabytes of RAM.

29
00:01:27.720 --> 00:01:29.640
This is a really simple example

30
00:01:29.640 --> 00:01:33.000
of scaling up or using vertical scaling.

31
00:01:33.000 --> 00:01:35.310
Vertical scaling is very popular

32
00:01:35.310 --> 00:01:37.680
in traditional systems and networks,

33
00:01:37.680 --> 00:01:41.640
and it can also be done when you're using cloud services.

34
00:01:41.640 --> 00:01:43.590
For example, let's pretend

35
00:01:43.590 --> 00:01:46.020
that you're going to start a new blog

36
00:01:46.020 --> 00:01:49.110
and you decide to use a simple cloud solution

37
00:01:49.110 --> 00:01:52.320
like AWS Lightsail to host that blog.

38
00:01:52.320 --> 00:01:56.370
Well, you might start out with the $5 per month plan.

39
00:01:56.370 --> 00:01:59.130
This plan might have one gigabyte of memory,

40
00:01:59.130 --> 00:02:02.820
a single-core processor, 40 gigabytes of disk space,

41
00:02:02.820 --> 00:02:05.910
and two terabytes of data transfer each month.

42
00:02:05.910 --> 00:02:09.630
Now, over time, your blog gains more readers

43
00:02:09.630 --> 00:02:12.480
and you realize that you need to scale up

44
00:02:12.480 --> 00:02:13.830
or scale vertically.

45
00:02:13.830 --> 00:02:15.570
So, you click a button

46
00:02:15.570 --> 00:02:19.110
and you upgrade to the $20 per month plan.

47
00:02:19.110 --> 00:02:21.600
This gives you four gigabytes of memory

48
00:02:21.600 --> 00:02:23.970
instead of that one gigabyte of memory,

49
00:02:23.970 --> 00:02:26.160
increases your data transfer allowance,

50
00:02:26.160 --> 00:02:30.270
and doubles your processing power and disk storage space.

51
00:02:30.270 --> 00:02:34.680
This is vertical scalability or scaling up.

52
00:02:34.680 --> 00:02:37.170
Now, you can keep scaling up vertically

53
00:02:37.170 --> 00:02:39.870
until you reach the largest plan size.

54
00:02:39.870 --> 00:02:43.740
And let's say that plan gives you 32 gigabytes of memory

55
00:02:43.740 --> 00:02:48.210
an 8-core processor, 640 gigabytes of disk space,

56
00:02:48.210 --> 00:02:51.480
and 7 terabytes of data transfer per month.
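
NOTE
A minimal sketch of the vertical-scaling ceiling described above, assuming plans are compared by memory only.
The memory sizes (1, 4, and 32 GB) come from the lesson's example; the plan names and the pick_plan helper are hypothetical.
```python
# Hypothetical sketch: pick the smallest plan whose memory covers the need.
# Memory sizes follow the lesson's example tiers; the plan names are made up.
PLANS_GB = [("small", 1), ("medium", 4), ("largest", 32)]
def pick_plan(required_gb):
    """Return the first (smallest) plan that fits, or None if even the largest is too small."""
    for name, memory_gb in PLANS_GB:
        if memory_gb >= required_gb:
            return name
    return None  # vertical ceiling reached; scaling out is the remaining option
print(pick_plan(3))
print(pick_plan(64))
```
A None result stands for the point where no larger plan exists and the workload has to be spread across machines instead.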

57
00:02:51.480 --> 00:02:55.350
So, what do you do when you can't vertically scale anymore?

58
00:02:55.350 --> 00:02:58.560
Well, you're going to need to re-architect your website

59
00:02:58.560 --> 00:03:03.510
to allow for horizontal scaling, also known as scaling out.

60
00:03:03.510 --> 00:03:06.000
Horizontal scaling is not as easy

61
00:03:06.000 --> 00:03:08.280
as just adding more resources.

62
00:03:08.280 --> 00:03:09.990
Horizontal scaling forces you

63
00:03:09.990 --> 00:03:13.950
to break down your workloads into smaller pieces of logic

64
00:03:13.950 --> 00:03:18.180
that can be executed in parallel across multiple machines.

65
00:03:18.180 --> 00:03:20.670
For example, if you are using a database

66
00:03:20.670 --> 00:03:24.390
to run your website, horizontal scaling will allow you

67
00:03:24.390 --> 00:03:27.870
to partition the data across multiple databases,

68
00:03:27.870 --> 00:03:32.430
so that each database contains only part of the whole data set.

69
00:03:32.430 --> 00:03:34.860
So, let's think back to that blog.

70
00:03:34.860 --> 00:03:37.170
You might have a collection of thousands

71
00:03:37.170 --> 00:03:40.770
of different articles split across multiple servers,

72
00:03:40.770 --> 00:03:44.370
one server for each year of articles that you wrote.

73
00:03:44.370 --> 00:03:47.700
So, when somebody wants to read one of those articles,

74
00:03:47.700 --> 00:03:50.850
the request will get answered by the corresponding server

75
00:03:50.850 --> 00:03:53.100
and database based on the year

76
00:03:53.100 --> 00:03:55.590
of the article that was requested.

77
00:03:55.590 --> 00:03:56.910
This is a good example

78
00:03:56.910 --> 00:04:00.090
of breaking down a workload into smaller pieces.
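
NOTE
A rough sketch of the year-based partitioning from the blog example, assuming one database host per year of articles.
The hostnames, the Article class, and the shard_for helper are all hypothetical and only illustrate the routing idea.
```python
# Hypothetical sketch: route an article request to a shard chosen by publication year.
from dataclasses import dataclass
@dataclass(frozen=True)
class Article:
    slug: str
    year: int
# Assumed shard map: one database host per year of articles (hostnames are made up).
SHARDS = {
    2021: "db-2021.example.internal",
    2022: "db-2022.example.internal",
    2023: "db-2023.example.internal",
}
def shard_for(article: Article) -> str:
    """Return the host that stores articles from the article's year."""
    try:
        return SHARDS[article.year]
    except KeyError:
        raise LookupError(f"no shard configured for year {article.year}") from None
print(shard_for(Article(slug="scaling-basics", year=2022)))
```
Adding a new year of articles then means adding one more entry and one more machine, rather than enlarging an existing server.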

79
00:04:00.090 --> 00:04:03.330
Once workloads are broken down into smaller pieces,

80
00:04:03.330 --> 00:04:05.490
scaling out can occur.

81
00:04:05.490 --> 00:04:08.940
In general, scaling out is better for the long term,

82
00:04:08.940 --> 00:04:11.700
because adding an individual machine

83
00:04:11.700 --> 00:04:15.120
is cheaper than adding resources to a single machine.

84
00:04:15.120 --> 00:04:18.240
Also, scaling out is virtually limitless,

85
00:04:18.240 --> 00:04:21.720
because you can keep adding smaller, inexpensive machines

86
00:04:21.720 --> 00:04:23.550
to meet the increasing workload

87
00:04:23.550 --> 00:04:27.630
that's being caused by additional users and their requests.

88
00:04:27.630 --> 00:04:30.570
By focusing on horizontal scaling,

89
00:04:30.570 --> 00:04:33.990
your resources become infinitely elastic,

90
00:04:33.990 --> 00:04:37.470
and the only limit becomes your ability to pay

91
00:04:37.470 --> 00:04:41.340
for cloud computing time from your cloud service provider.

92
00:04:41.340 --> 00:04:44.460
By using horizontal scaling in this manner,

93
00:04:44.460 --> 00:04:48.240
you can have instant and continuous availability,

94
00:04:48.240 --> 00:04:50.700
no limit to hardware capacity,

95
00:04:50.700 --> 00:04:52.500
cost that's going to be assessed

96
00:04:52.500 --> 00:04:55.140
on a per-use basis in the cloud,

97
00:04:55.140 --> 00:04:59.280
built-in redundancy, and it's going to be easier to size

98
00:04:59.280 --> 00:05:03.000
and resize your infrastructure to meet your needs.

99
00:05:03.000 --> 00:05:05.640
But the challenge with horizontal scaling

100
00:05:05.640 --> 00:05:08.340
is that you first need to design your applications

101
00:05:08.340 --> 00:05:11.280
to work in smaller, self-contained blocks

102
00:05:11.280 --> 00:05:14.640
that interact to create the results you need.

103
00:05:14.640 --> 00:05:16.680
To do this, your application

104
00:05:16.680 --> 00:05:20.040
must be designed as a stateless application.

105
00:05:20.040 --> 00:05:23.910
Stateless applications do not store any session data

106
00:05:23.910 --> 00:05:26.940
or information about the user's interactions

107
00:05:26.940 --> 00:05:28.680
on the server itself.

108
00:05:28.680 --> 00:05:33.420
Instead, each request from a user is treated independently

109
00:05:33.420 --> 00:05:36.270
with no reliance on previous requests.

110
00:05:36.270 --> 00:05:38.760
This design allows an application

111
00:05:38.760 --> 00:05:42.450
to be easily distributed across multiple servers,

112
00:05:42.450 --> 00:05:43.434
as no server

113
00:05:43.434 --> 00:05:48.030
needs to retain information about user sessions or state.

114
00:05:48.030 --> 00:05:51.720
Additionally, if there is any necessary session data,

115
00:05:51.720 --> 00:05:55.230
it can be stored externally in a database or cache,

116
00:05:55.230 --> 00:05:56.670
allowing any server

117
00:05:56.670 --> 00:06:00.480
to access that session data and handle a request.

118
00:06:00.480 --> 00:06:03.960
This makes horizontal scaling much simpler,

119
00:06:03.960 --> 00:06:06.330
because you can add or remove servers

120
00:06:06.330 --> 00:06:09.090
without disrupting the user experience

121
00:06:09.090 --> 00:06:11.580
or losing important session data.
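
NOTE
A minimal sketch of the stateless design described above, assuming session data lives in a shared external store.
SessionStore is a hypothetical stand-in backed by a plain dict; a real deployment would use an actual cache or database.
```python
# Hypothetical sketch: two stateless servers sharing one external session store.
class SessionStore:
    """Stand-in for an external cache or database; a plain dict is used here."""
    def __init__(self):
        self._data = {}
    def load(self, session_id):
        return dict(self._data.get(session_id, {}))
    def save(self, session_id, session):
        self._data[session_id] = dict(session)
class Server:
    """Keeps no per-user state; everything it needs comes from the request and the store."""
    def __init__(self, name, store):
        self.name = name
        self.store = store
    def handle(self, session_id):
        session = self.store.load(session_id)
        session["views"] = session.get("views", 0) + 1
        self.store.save(session_id, session)
        return f"{self.name} served view {session['views']}"
store = SessionStore()
server_a, server_b = Server("server-a", store), Server("server-b", store)
print(server_a.handle("user-42"))  # server-a served view 1
print(server_b.handle("user-42"))  # server-b served view 2
```
Because the count is kept in the store and not on either server, the second request can land on a different machine and still continue the same session.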

122
00:06:11.580 --> 00:06:16.110
So, remember, scalability refers to the ability

123
00:06:16.110 --> 00:06:19.230
of a system to handle increasing demand

124
00:06:19.230 --> 00:06:21.930
by effectively managing resources.

125
00:06:21.930 --> 00:06:24.750
Vertical scaling, or scaling up,

126
00:06:24.750 --> 00:06:27.930
involves adding more resources such as memory

127
00:06:27.930 --> 00:06:29.430
or processing power

128
00:06:29.430 --> 00:06:32.670
to a single machine to improve its capacity.

129
00:06:32.670 --> 00:06:35.820
However, vertical scaling has limits,

130
00:06:35.820 --> 00:06:37.710
and once those limits are reached,

131
00:06:37.710 --> 00:06:41.400
horizontal scaling, or scaling out, becomes necessary.

132
00:06:41.400 --> 00:06:44.220
Horizontal scaling involves spreading workloads

133
00:06:44.220 --> 00:06:46.170
across multiple machines

134
00:06:46.170 --> 00:06:48.810
and typically involves designing applications

135
00:06:48.810 --> 00:06:51.330
to be stateless, meaning that each request

136
00:06:51.330 --> 00:06:53.070
is handled independently

137
00:06:53.070 --> 00:06:56.313
without relying on previous interactions.

