WEBVTT

1
00:00:00.000 --> 00:00:01.920
In this section of the course,

2
00:00:01.920 --> 00:00:05.520
we are going to discuss data security concepts.

3
00:00:05.520 --> 00:00:08.520
The data security concepts section of the course

4
00:00:08.520 --> 00:00:11.850
focuses on domain three, security engineering,

5
00:00:11.850 --> 00:00:14.640
specifically objective 3.8,

6
00:00:14.640 --> 00:00:16.800
which states that given a scenario,

7
00:00:16.800 --> 00:00:21.120
you must be able to apply the appropriate cryptographic use case

8
00:00:21.120 --> 00:00:22.740
and/or technique.

9
00:00:22.740 --> 00:00:25.830
Ensuring the accuracy and security of information

10
00:00:25.830 --> 00:00:28.650
through data security concepts is crucial

11
00:00:28.650 --> 00:00:32.610
in protecting sensitive information from unauthorized access

12
00:00:32.610 --> 00:00:34.650
and ensuring its integrity.

13
00:00:34.650 --> 00:00:38.790
Data security concepts involve not only safeguarding data

14
00:00:38.790 --> 00:00:41.130
as it moves through various stages,

15
00:00:41.130 --> 00:00:45.150
but also implementing measures to uphold its reliability

16
00:00:45.150 --> 00:00:47.520
and adhere to privacy standards.

17
00:00:47.520 --> 00:00:50.250
Furthermore, effective and mature management

18
00:00:50.250 --> 00:00:53.820
and compliance strategies help maintain data integrity

19
00:00:53.820 --> 00:00:55.860
and protect sensitive information

20
00:00:55.860 --> 00:00:59.010
across different systems and applications.

21
00:00:59.010 --> 00:01:00.690
As we go through this section,

22
00:01:00.690 --> 00:01:04.350
we'll cover many topics related to data security concepts,

23
00:01:04.350 --> 00:01:09.030
including data integrity, integrity use cases, blockchain,

24
00:01:09.030 --> 00:01:11.970
data protection, data state protection,

25
00:01:11.970 --> 00:01:13.740
data handling and management,

26
00:01:13.740 --> 00:01:16.530
and data compliance and privacy.

27
00:01:16.530 --> 00:01:19.200
First, we will look at data integrity.

28
00:01:19.200 --> 00:01:22.800
Data integrity ensures the accuracy, consistency,

29
00:01:22.800 --> 00:01:25.770
and reliability of data throughout its lifecycle.

30
00:01:25.770 --> 00:01:29.160
Data integrity ensures that data remains unaltered

31
00:01:29.160 --> 00:01:31.380
without permission and is important

32
00:01:31.380 --> 00:01:32.940
from the moment it is created

33
00:01:32.940 --> 00:01:35.700
as well as in its storage and retrieval.

34
00:01:35.700 --> 00:01:39.840
Hashing is a technique used to validate data integrity.

35
00:01:39.840 --> 00:01:44.280
Hashing algorithms utilize one-way cryptographic functions

36
00:01:44.280 --> 00:01:48.480
to convert any size input into a fixed-size output.

37
00:01:48.480 --> 00:01:51.450
The output is known as a hash value.

38
00:01:51.450 --> 00:01:55.020
A hash value acts as a practically unique fingerprint of the original data

39
00:01:55.020 --> 00:01:57.300
and cannot be reverse engineered

40
00:01:57.300 --> 00:01:59.370
to reveal the original data.

41
00:01:59.370 --> 00:02:04.290
Important hashing algorithms include SHA, MD5,

42
00:02:04.290 --> 00:02:08.790
and RACE Integrity Primitives Evaluation Message Digest,

43
00:02:08.790 --> 00:02:10.440
or RIPEMD.

44
00:02:10.440 --> 00:02:14.700
Overall, hashing algorithms are crucial in cryptography

45
00:02:14.700 --> 00:02:17.250
for generating key material in ciphers,

46
00:02:17.250 --> 00:02:19.950
creating and verifying digital signatures,

47
00:02:19.950 --> 00:02:21.960
securely storing passwords,

48
00:02:21.960 --> 00:02:26.370
and ensuring data integrity by producing unique hash values

49
00:02:26.370 --> 00:02:30.990
that verify the authenticity of data and detect alterations.
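
NOTE
A minimal sketch of the integrity check described above, assuming Python and its
standard hashlib module. The sample data and the fingerprint helper name are
illustrative assumptions, not part of the course material.
```python
import hashlib
def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hash value of the input as a hex string."""
    return hashlib.sha256(data).hexdigest()
original = b"quarterly-report-v1"
tampered = b"quarterly-report-v2"
# Any size input produces a fixed-size output: 64 hex characters for SHA-256.
digest_length = len(fingerprint(original))
# The same input always yields the same hash value, so a match indicates no change.
unchanged = fingerprint(original) == fingerprint(b"quarterly-report-v1")
# Even a one-character change produces a different hash value.
altered = fingerprint(original) != fingerprint(tampered)
```
Comparing a stored hash value with a freshly computed one is how an alteration would be detected.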

50
00:02:30.990 --> 00:02:34.800
Next, we will explore integrity use cases.

51
00:02:34.800 --> 00:02:37.110
Use cases related to integrity

52
00:02:37.110 --> 00:02:40.560
involve maintaining the accuracy and trustworthiness

53
00:02:40.560 --> 00:02:42.300
of data or software

54
00:02:42.300 --> 00:02:45.090
to ensure its functionality and security.

55
00:02:45.090 --> 00:02:49.140
Specific use cases include knowing the software provenance

56
00:02:49.140 --> 00:02:51.630
as well as software code integrity.

57
00:02:51.630 --> 00:02:54.480
Let's explore these concepts in more detail.

58
00:02:54.480 --> 00:02:57.330
Tracking and verifying the software provenance

59
00:02:57.330 --> 00:03:00.660
ensures the origin and history of software components

60
00:03:00.660 --> 00:03:02.580
are known and secure.

61
00:03:02.580 --> 00:03:05.407
Software provenance helps answer the question,

62
00:03:05.407 --> 00:03:08.670
"Where did this software come from and can I trust it?"

63
00:03:08.670 --> 00:03:12.270
Software and code integrity ensures that software and code

64
00:03:12.270 --> 00:03:16.050
have not been altered from their original and trusted state.

65
00:03:16.050 --> 00:03:19.710
For example, a company developing a software application

66
00:03:19.710 --> 00:03:21.480
can use provenance tracking

67
00:03:21.480 --> 00:03:24.300
to document the entire development process,

68
00:03:24.300 --> 00:03:27.990
including who wrote each piece of code, when it was written,

69
00:03:27.990 --> 00:03:30.330
and what changes have been made over time.

70
00:03:30.330 --> 00:03:34.260
Alongside this, the company may implement integrity checks

71
00:03:34.260 --> 00:03:38.100
through cryptographic hashes to verify the code integrity.

72
00:03:38.100 --> 00:03:41.220
By combining these approaches, a company can ensure

73
00:03:41.220 --> 00:03:44.400
not only that the code came from a trusted source

74
00:03:44.400 --> 00:03:46.860
and followed an auditable development path,

75
00:03:46.860 --> 00:03:49.800
but also that it has not been tampered with.

76
00:03:49.800 --> 00:03:52.710
After that, we will look at blockchain.

77
00:03:52.710 --> 00:03:57.330
Blockchain is a decentralized distributed ledger technology

78
00:03:57.330 --> 00:04:01.620
that ensures data integrity and transaction transparency

79
00:04:01.620 --> 00:04:05.640
by recording transactions in a series of immutable blocks.

80
00:04:05.640 --> 00:04:08.430
Immutable means that the blocks are unchangeable

81
00:04:08.430 --> 00:04:09.750
after creation.

82
00:04:09.750 --> 00:04:11.280
An immutable database

83
00:04:11.280 --> 00:04:14.490
refers to a database where data, once written,

84
00:04:14.490 --> 00:04:17.430
is not able to be altered or deleted.

85
00:04:17.430 --> 00:04:20.040
Immutable databases ensure a permanent record

86
00:04:20.040 --> 00:04:22.020
of blockchain transactions.

87
00:04:22.020 --> 00:04:24.750
For example, a blockchain-based system

88
00:04:24.750 --> 00:04:27.420
can be used to track supply chain transactions

89
00:04:27.420 --> 00:04:31.230
where each transaction is recorded in an immutable ledger.

90
00:04:31.230 --> 00:04:34.320
This provides a transparent and tamper-proof history

91
00:04:34.320 --> 00:04:37.170
of goods from origin to destination.

92
00:04:37.170 --> 00:04:39.840
Next, we will explore data protection.

93
00:04:39.840 --> 00:04:43.170
Data protection utilizes methods to safeguard data

94
00:04:43.170 --> 00:04:47.070
from unauthorized access, alteration, or loss,

95
00:04:47.070 --> 00:04:51.750
ensuring its confidentiality, integrity, and availability.

96
00:04:51.750 --> 00:04:54.510
Data protection concepts include techniques

97
00:04:54.510 --> 00:04:59.340
such as tokenization, cryptographic erase, obfuscation,

98
00:04:59.340 --> 00:05:03.060
cryptographic obfuscation, and serialization.

99
00:05:03.060 --> 00:05:06.660
Let's pause a moment to discuss each of these concepts.

100
00:05:06.660 --> 00:05:10.470
The process of tokenization replaces sensitive data

101
00:05:10.470 --> 00:05:13.257
with unique identifiers or tokens.

102
00:05:13.257 --> 00:05:16.260
With tokenization, tokens can only be mapped back

103
00:05:16.260 --> 00:05:19.290
to their original data through a secure system,

104
00:05:19.290 --> 00:05:22.410
preventing the exposure of sensitive information.
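
NOTE
A rough sketch of tokenization as described above, assuming Python and its standard
secrets module. The TokenVault class and the sample card number are hypothetical
names introduced only for illustration; a real vault would be a hardened,
access-controlled service rather than an in-memory dictionary.
```python
import secrets
class TokenVault:
    """Hypothetical in-memory vault that maps tokens back to original values."""
    def __init__(self) -> None:
        self._store = {}
    def tokenize(self, sensitive: str) -> str:
        # The token is random, so it reveals nothing about the original value.
        token = secrets.token_hex(16)
        self._store[token] = sensitive
        return token
    def detokenize(self, token: str) -> str:
        # Only the vault can map a token back to the original data.
        return self._store[token]
vault = TokenVault()
card_number = "4111111111111111"  # sample test value, not real cardholder data
token = vault.tokenize(card_number)
```
Other systems would store and pass around only the token, calling detokenize through the vault when the original value is truly needed.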

105
00:05:22.410 --> 00:05:25.830
Cryptographic erase is the secure deletion of data

106
00:05:25.830 --> 00:05:28.500
through encryption and subsequent destruction

107
00:05:28.500 --> 00:05:32.430
of the encryption keys, making the data irrecoverable.

108
00:05:32.430 --> 00:05:35.130
Obfuscation is used to change data

109
00:05:35.130 --> 00:05:36.870
to hide its original meaning,

110
00:05:36.870 --> 00:05:40.980
protecting it from unauthorized access or analysis.

111
00:05:40.980 --> 00:05:44.100
Obfuscation on its own may involve encoding

112
00:05:44.100 --> 00:05:45.720
to change the format of data,

113
00:05:45.720 --> 00:05:48.720
but does not utilize cryptographic algorithms.

114
00:05:48.720 --> 00:05:51.480
Cryptographic obfuscation, on the other hand,

115
00:05:51.480 --> 00:05:54.600
transforms data into an unreadable format

116
00:05:54.600 --> 00:05:56.820
using encryption algorithms.

117
00:05:56.820 --> 00:05:59.460
Cryptographic obfuscation makes it difficult

118
00:05:59.460 --> 00:06:03.210
for unauthorized parties to interpret or use the data.

119
00:06:03.210 --> 00:06:06.600
Finally, serialization is used to convert data

120
00:06:06.600 --> 00:06:10.350
into a format suitable for storage or transmission,

121
00:06:10.350 --> 00:06:13.860
while preserving its ability to be accurately reconstructed

122
00:06:13.860 --> 00:06:15.240
at a later time.
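
NOTE
A small sketch of serialization, assuming Python and its standard json module.
The order record is made-up sample data used only for illustration.
```python
import json
order = {"order_id": 1001, "items": ["keyboard", "mouse"], "total": 59.98}
# Serialize: convert the in-memory object into a string for storage or transmission.
serialized = json.dumps(order)
# Deserialize: reconstruct an equivalent object from that string at a later time.
restored = json.loads(serialized)
```
Serialization on its own only changes how the data is represented; it does not hide the data.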

123
00:06:15.240 --> 00:06:17.850
In practice, an e-commerce organization

124
00:06:17.850 --> 00:06:21.570
might use tokenization to replace credit card information

125
00:06:21.570 --> 00:06:24.690
with tokens to protect the sensitive cardholder data.

126
00:06:24.690 --> 00:06:27.600
During data use, obfuscation may be used

127
00:06:27.600 --> 00:06:29.370
to conceal cardholder data.

128
00:06:29.370 --> 00:06:32.250
When in storage, serialization may be used

129
00:06:32.250 --> 00:06:35.340
to store the cardholder data in a structured format

130
00:06:35.340 --> 00:06:38.520
and ensure that it can be reconstructed later.

131
00:06:38.520 --> 00:06:41.130
When the cardholder data is no longer required

132
00:06:41.130 --> 00:06:44.970
to be held in storage, cryptographic erase can be used

133
00:06:44.970 --> 00:06:48.360
to ensure it is securely deleted from storage.

134
00:06:48.360 --> 00:06:51.690
Following that, we will look at data state protection.

135
00:06:51.690 --> 00:06:54.510
Data state protection safeguards data

136
00:06:54.510 --> 00:06:56.430
based on its current condition,

137
00:06:56.430 --> 00:06:59.250
whether it is stored, being transmitted,

138
00:06:59.250 --> 00:07:01.230
or being actively used.

139
00:07:01.230 --> 00:07:05.100
Data state protection concepts include data at rest,

140
00:07:05.100 --> 00:07:08.880
data in transit, and data in use, or in processing.

141
00:07:08.880 --> 00:07:12.270
Let's discuss each of these concepts in greater detail.

142
00:07:12.270 --> 00:07:15.000
Data at rest refers to data that is stored

143
00:07:15.000 --> 00:07:18.060
on physical or cloud-based storage systems.

144
00:07:18.060 --> 00:07:21.630
While in storage, data may be protected using encryption

145
00:07:21.630 --> 00:07:25.770
and access controls to prevent unauthorized access.

146
00:07:25.770 --> 00:07:28.890
Next, data in transit is defined as data

147
00:07:28.890 --> 00:07:32.010
that is being actively transmitted over networks.

148
00:07:32.010 --> 00:07:35.370
Data in transit is secured by using encrypted tunnels

149
00:07:35.370 --> 00:07:38.370
to protect it from interception and tampering.

150
00:07:38.370 --> 00:07:42.030
Next, data in use or data in processing

151
00:07:42.030 --> 00:07:44.520
is data that is actively being accessed

152
00:07:44.520 --> 00:07:47.100
or manipulated by applications.

153
00:07:47.100 --> 00:07:51.090
Data in use or processing is located in a system's RAM

154
00:07:51.090 --> 00:07:52.890
or CPU registers.

155
00:07:52.890 --> 00:07:55.560
Data in use may be secured through techniques

156
00:07:55.560 --> 00:07:57.330
like in-memory encryption.

157
00:07:57.330 --> 00:08:00.540
In-memory encryption is the process of encrypting data

158
00:08:00.540 --> 00:08:03.510
while it resides in memory during use or processing.

159
00:08:03.510 --> 00:08:05.610
Let's take a look at how an organization

160
00:08:05.610 --> 00:08:08.550
can protect its data in each of these states.

161
00:08:08.550 --> 00:08:11.910
The organization might use the Advanced Encryption Standard

162
00:08:11.910 --> 00:08:15.960
or AES to encrypt data at rest on its servers,

163
00:08:15.960 --> 00:08:19.200
ensuring that stored files and databases are secure

164
00:08:19.200 --> 00:08:21.240
from unauthorized access.

165
00:08:21.240 --> 00:08:24.630
The organization may then employ Transport Layer Security

166
00:08:24.630 --> 00:08:29.130
or TLS to protect its data in transit across the network,

167
00:08:29.130 --> 00:08:31.890
ensuring that data transmitted over the internet

168
00:08:31.890 --> 00:08:34.380
or internal networks remains encrypted

169
00:08:34.380 --> 00:08:36.540
and secure during transfer.

170
00:08:36.540 --> 00:08:39.090
Finally, the organization might implement

171
00:08:39.090 --> 00:08:41.310
in-memory encryption such as

172
00:08:41.310 --> 00:08:44.340
the CPU Secure Memory Encryption feature

173
00:08:44.340 --> 00:08:47.940
to protect data while it is being processed by applications,

174
00:08:47.940 --> 00:08:51.390
ensuring that sensitive information remains protected

175
00:08:51.390 --> 00:08:54.390
even when temporarily held in RAM.

176
00:08:54.390 --> 00:08:57.870
Next, we will explore data handling and management.

177
00:08:57.870 --> 00:09:01.290
Data handling and management is used to safeguard data

178
00:09:01.290 --> 00:09:03.510
throughout its entire lifecycle

179
00:09:03.510 --> 00:09:07.920
to ensure its confidentiality, integrity, and availability.

180
00:09:07.920 --> 00:09:09.930
Data handling and management concepts

181
00:09:09.930 --> 00:09:14.160
include data anonymization and data sanitization.

182
00:09:14.160 --> 00:09:16.710
Data anonymization is modifying data

183
00:09:16.710 --> 00:09:20.580
to prevent the identification of individual data owners

184
00:09:20.580 --> 00:09:23.640
while maintaining its ability to be analyzed.

185
00:09:23.640 --> 00:09:25.740
The identities behind anonymized data remain secret

186
00:09:25.740 --> 00:09:29.190
even if the remaining data is exposed or shared.

187
00:09:29.190 --> 00:09:33.060
For instance, removing personally identifiable information

188
00:09:33.060 --> 00:09:36.660
or PII from a dataset anonymizes the data

189
00:09:36.660 --> 00:09:38.850
and allows researchers to use the data

190
00:09:38.850 --> 00:09:41.250
without risking privacy breaches.

191
00:09:41.250 --> 00:09:43.650
Data sanitization, on the other hand,

192
00:09:43.650 --> 00:09:46.380
is the process of securely destroying data

193
00:09:46.380 --> 00:09:50.430
to ensure that sensitive information is completely removed

194
00:09:50.430 --> 00:09:54.150
before the storage media is disposed of or repurposed.

195
00:09:54.150 --> 00:09:57.810
For example, an organization may use Microsoft's

196
00:09:57.810 --> 00:10:01.770
Sysinternals SDelete tool to securely destroy files

197
00:10:01.770 --> 00:10:04.590
by overwriting the data with random patterns,

198
00:10:04.590 --> 00:10:06.450
making it irrecoverable.

199
00:10:06.450 --> 00:10:10.080
Finally, we will look at data compliance and privacy.

200
00:10:10.080 --> 00:10:12.270
Data compliance and privacy involves

201
00:10:12.270 --> 00:10:15.570
adhering to legal and regulatory requirements

202
00:10:15.570 --> 00:10:17.760
to protect and manage data in a way

203
00:10:17.760 --> 00:10:20.190
that respects individual privacy

204
00:10:20.190 --> 00:10:22.380
and meets legal obligations.

205
00:10:22.380 --> 00:10:24.720
Data compliance and privacy concepts

206
00:10:24.720 --> 00:10:26.970
include privacy applications,

207
00:10:26.970 --> 00:10:30.840
legal considerations, and regulatory considerations.

208
00:10:30.840 --> 00:10:33.990
Privacy applications are tools and systems

209
00:10:33.990 --> 00:10:36.750
designed to safeguard personal information

210
00:10:36.750 --> 00:10:39.060
and ensure it is handled according to policies

211
00:10:39.060 --> 00:10:40.500
and regulations.

212
00:10:40.500 --> 00:10:44.700
A privacy application's purpose is to enhance user trust

213
00:10:44.700 --> 00:10:47.880
and protect data from unauthorized access.

214
00:10:47.880 --> 00:10:50.100
Legal and regulatory considerations

215
00:10:50.100 --> 00:10:52.350
refer to the requirements and standards

216
00:10:52.350 --> 00:10:55.290
set by laws and regulations such as

217
00:10:55.290 --> 00:10:58.560
the General Data Protection Regulation or GDPR,

218
00:10:58.560 --> 00:11:02.610
or the California Consumer Privacy Act or CCPA.

219
00:11:02.610 --> 00:11:05.700
These laws and regulations govern how organizations

220
00:11:05.700 --> 00:11:07.140
must handle data.

221
00:11:07.140 --> 00:11:10.650
In application, a company may use privacy applications

222
00:11:10.650 --> 00:11:12.210
to ensure compliance

223
00:11:12.210 --> 00:11:14.640
with the General Data Protection Regulation

224
00:11:14.640 --> 00:11:17.970
by implementing data encryption and access controls,

225
00:11:17.970 --> 00:11:20.490
while also following legal guidelines

226
00:11:20.490 --> 00:11:23.310
to avoid penalties for non-compliance.

227
00:11:23.310 --> 00:11:26.070
To finish things off, we'll take a short quiz

228
00:11:26.070 --> 00:11:28.770
to see what you learned during this section of the course,

229
00:11:28.770 --> 00:11:31.710
and we will review each of those quiz questions

230
00:11:31.710 --> 00:11:33.840
to fully ensure you can explain

231
00:11:33.840 --> 00:11:35.550
why the right answers were right

232
00:11:35.550 --> 00:11:37.230
and the wrong answers were wrong.

233
00:11:37.230 --> 00:11:38.940
So let's get ready

234
00:11:38.940 --> 00:11:41.190
to dive into data security concepts

235
00:11:41.190 --> 00:11:42.873
in this section of the course.

