Welcome to this DP-900 session, where we dive into the fundamental data concepts essential for working with Azure data services. Whether you're an aspiring data professional, a cloud enthusiast, or preparing for the DP-900 certification, this lecture will give you a solid foundation in relational, non-relational, big data, and analytics concepts.
What You'll Learn in This Session:
- Introduction to Core Data Concepts – Structured vs. Unstructured Data
- Relational vs. Non-Relational Databases – Key Differences & Use Cases
- Big Data & Analytics – Processing & Storing Large-Scale Data
- Data Storage in Azure – SQL, NoSQL, Blob Storage & More
- Exploring Batch vs. Stream Processing – Real-Time Data Insights
- Azure Data Services Overview – Azure SQL, Cosmos DB, Synapse Analytics & More
Who Should Watch This?
Beginners & IT professionals exploring data fundamentals
Students & data enthusiasts preparing for the DP-900 exam
Business users & analysts looking to understand data management
Developers working with Azure data storage & analytics solutions
Key Highlights:
- Easy-to-understand explanations of essential data concepts
- Practical use cases & real-world examples in Azure
- Exam-focused insights to help you pass the DP-900 certification
- Live demos of data storage & analytics tools in Azure
Master core data concepts and take your first step into Azure data services!
Explore Our Other Courses and Additional Resources on: https://skilltech.club/
Transcript
00:00 Now, before we move forward in Azure Data Fundamentals, we have to focus on module number one,
00:14 which is exploring core data concepts. There are various data concepts that you need to
00:20 understand and use when working with data in the cloud, and that's what we are going to focus
00:25 on in this module. This module consists of five lessons.
00:29 The first lesson explores the core data concepts. The second one focuses
00:33 on roles and responsibilities in the world of data. Obviously, if you are a database
00:39 administrator, a data analyst, or a data engineer, the way you deal with
00:43 data is going to be different. Lesson three describes the concepts of relational
00:49 data, and lesson four covers the concepts of non-relational data. At the end
00:55 of this module, we are also going to focus on the concept of data analytics, and then we
01:00 will see the various types of analytics that we can try with data.
01:05 In this video, we are going to explore core data concepts. This lesson focuses
01:10 on identifying how data is defined and stored in its various formats. We are also
01:15 going to look at the characteristics of relational and non-relational data. We will describe and
01:21 differentiate how data workloads are stored, and we will also
01:25 describe and differentiate batch and streaming data, because we need to see how stream processing
01:32 and batch processing happen with data. Let's start with: what is data? Well,
01:37 the definition says that data is nothing but a collection of facts, numbers, descriptions, or
01:43 objects stored in a structured, semi-structured, or unstructured way, and as this definition
01:50 says, data can be stored in various formats. It can be structured,
01:56 semi-structured, or unstructured. If you have ever dealt with data that is something like
02:02 notes, some kind of file system, or a notepad file, then you can count that data
02:07 as fully unstructured. Data in the form of images and videos also
02:13 comes under the unstructured data format. Now, instead of this unstructured way,
02:18 suppose you go with a slightly more organized approach; then you get semi-structured data,
02:22 where the data is stored in the form of key-value pairs, and as visible on the screen,
02:27 semi-structured data is going to have this kind of key-value data stored,
02:31 which is not fully structured, but we can still understand what kind of data is stored in it.
02:36 And if you want to go with proper tabular data, where we follow a row-and-column structure,
02:41 we are also going to have things like primary key, foreign key, and
02:45 unique key constraints, and if you follow a proper schema every time you
02:51 store the data, then that is structured data.
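To make the three formats concrete, here is a minimal sketch in Python (my own illustration, not from the lecture); the customer record and every field name in it are hypothetical.

    # The same hypothetical customer record in the three formats discussed above.

    # Unstructured: free text in a notepad-style file; no fields you can query directly.
    unstructured = "Asha Rao called on 2024-03-01 about order 1042; prefers email."

    # Semi-structured: key-value pairs (JSON-style); self-describing but with a flexible schema.
    semi_structured = {
        "name": "Asha Rao",
        "contact": {"email": "asha@example.com"},
        "orders": [1042],
    }

    # Structured: a fixed set of columns, like one row of a relational table
    # (the first value would act as the primary key; every row follows the same schema).
    structured_row = ("C-001", "Asha Rao", "asha@example.com")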
02:56 Now, when we talk about data, where we are storing the data is also equally important, and whatever
03:02 we store data in, we call a data store. It is important to understand the distinction between OLTP and OLAP data stores.
03:10 OLTP stands for Online Transactional Processing,
03:17 while OLAP stands for Online Analytical Processing. In OLTP, data is stored one transaction at a time.
03:24 Most of the time, when you have an application in which you perform data operations like
03:30 inserting or updating data, that is transactional processing, and
03:36 that data goes into an OLTP data store. Then there is another kind of data store, which is OLAP. OLAP is an
03:43 aggregated system. Data is usually imported from other OLTP systems and then aggregated.
03:50 The aggregation will typically occur at different levels. So, when
03:55 you store data in an OLAP data store, the aggregation is going to happen at different levels,
04:00 allowing you to drill down or up through multiple levels. For example, you can go from a company summary to
04:07 a department summary to, within a department, a particular employee's summary.
04:13 As you drill down level by level, you get further, more detailed data.
04:18 Because OLAP data is aggregated periodically, once the aggregation has been performed,
04:25 queries that summarize the data it contains are going to be very fast. So, compared to OLTP,
04:31 OLAP queries are going to be faster. Now, if you ask me when to choose which data store,
04:37 I will say that OLTP is good for many small updates that happen frequently,
04:42 while OLAP is good for large summarized queries that happen periodically but not very frequently.
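To illustrate the contrast, here is a small sketch using Python's built-in sqlite3 module (my own illustration; the Sales table and its columns are made up): many small single-transaction writes on the OLTP side, one periodic summarizing query on the OLAP side.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Sales (company TEXT, department TEXT, employee TEXT, amount REAL)")

    # OLTP-style workload: many small writes, committed one transaction at a time.
    with conn:  # each 'with' block commits as a single transaction
        conn.execute("INSERT INTO Sales VALUES ('Contoso', 'Retail', 'Asha', 120.0)")
    with conn:
        conn.execute("INSERT INTO Sales VALUES ('Contoso', 'Retail', 'Ravi', 80.0)")

    # OLAP-style workload: a large summarizing query that aggregates at the
    # company/department level, run periodically rather than per transaction.
    for row in conn.execute(
        "SELECT company, department, SUM(amount) FROM Sales GROUP BY company, department"
    ):
        print(row)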
04:49 Now, after data stores, we need to understand transactional workloads as well.
04:54 Transactional data is information that tracks the interactions related to an organization's activities.
05:00 When you focus on this, you have to focus on four words: atomicity, consistency,
05:07 isolation, and durability. Formally, these four words are known as the ACID properties, and I am sure that
05:14 if you are a tech or IT person, you have surely come across these four words
05:20 before. That's the reason I am not covering the ACID properties in depth right now.
05:27 But yes, that is also a very important point to understand.
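As a quick refresher on the atomicity part, here is my own minimal sketch (not from the lecture), again using sqlite3; the Accounts table and the transfer amounts are hypothetical.

    import sqlite3

    # A hypothetical two-account transfer: atomicity means both updates
    # commit together, or neither does.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Accounts (id TEXT PRIMARY KEY, balance REAL)")
    conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [("A", 100.0), ("B", 50.0)])
    conn.commit()

    try:
        with conn:  # one atomic transaction
            conn.execute("UPDATE Accounts SET balance = balance - 30 WHERE id = 'A'")
            conn.execute("UPDATE Accounts SET balance = balance + 30 WHERE id = 'B'")
    except sqlite3.Error:
        # On failure the whole transaction is rolled back automatically,
        # so the accounts never end up in a half-updated state.
        pass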
05:30 After the transactional workload, there is something else I want to focus on: we have analytical workloads.
05:37 Just as transactional workloads focus on OLTP, analytical workloads focus on OLAP.
05:43 Analytical workloads are typically read-only systems that store vast volumes of historical data
05:49 or business metrics. Analytics can be based on a snapshot of the data at a given point in time,
05:55 or on a series of snapshots. An example of analytical information is a report
06:01 of monthly sales, or any data on which you want to perform analytical operations
06:07 and then generate charts or historical reports.
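For example, a monthly sales report over a historical snapshot might look like this minimal sketch, assuming pandas and made-up figures (my own illustration, not a demo from the session).

    import pandas as pd

    # Hypothetical historical sales snapshot (a read-only analytical data set).
    sales = pd.DataFrame({
        "month": ["2024-01", "2024-01", "2024-02", "2024-02"],
        "department": ["Retail", "Online", "Retail", "Online"],
        "amount": [1200.0, 800.0, 1500.0, 950.0],
    })

    # A typical analytical query: summarize sales per month for a report or chart.
    monthly_report = sales.groupby("month")["amount"].sum()
    print(monthly_report)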
06:13 After the workloads, it's now time to understand data processing. Data processing is the
06:18 conversion of raw data into meaningful information through a process. We know that we can collect
06:25 data from various data sources. Maybe you have data from server-side logs,
06:31 maybe you have organizational servers that have been running for months or years, or maybe
06:37 data coming from devices like IoT devices or machines running in
06:42 the organization; that data is raw data. Now, when you have to
06:49 process the data, there are two different kinds of data processing: batch processing and stream processing.
06:55 Batch processing is suitable for handling large data sets efficiently. Stream processing is intended
07:01 for individual records or micro-batches consisting of only a few records. So when you
07:07 have huge data, you are going to focus on batch processing. When data arrives as individual records or small micro-batches,
07:12 you can go with stream processing. In batch processing, data elements are collected into a group,
07:17 and the whole group is then processed at a future time as a batch. That's why we call it batch processing:
07:22 the whole group is kept together as one batch of data. In stream processing,
07:28 each new piece of data is processed when it arrives, something like live streaming
07:33 of data. Think about it like this: you have some data that is streamed continuously, such as a video,
07:40 a price chart for currency conversion, or maybe the latest stock market or cryptocurrency
07:48 price analysis, which is continuously streaming from one end, and you have to process it
07:53 accordingly. In the same way, in stream processing, every new piece of data that comes into
08:00 the stream is ingested and then processed further in that same pipeline.
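Here is a toy sketch of the two styles in Python (my own illustration; the sensor readings are made up, and a real pipeline would use a streaming service rather than a loop like this): the batch function waits for the whole group, while the stream loop handles each record as it arrives.

    import time

    readings = [3.1, 4.7, 2.9, 5.2]  # hypothetical sensor values

    # Batch processing: collect the whole group first, then process it at once.
    def process_batch(batch):
        print("batch average:", sum(batch) / len(batch))

    process_batch(readings)

    # Stream processing: handle each new piece of data the moment it arrives.
    def stream(source):
        for value in source:
            time.sleep(0.1)  # stand-in for records arriving over time
            yield value

    for value in stream(readings):
        print("processed on arrival:", value)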
08:05 After this, we need to understand latency, which is nothing but the time taken for data to be received
08:11 and processed. Comparatively, batch processing can have a latency of hours,
08:17 because the amount of data is huge, while stream processing has a comparatively
08:23 lower latency, typically in seconds or milliseconds. So in this video,
08:29 we have focused on the core data concepts, where you have learned what types of data there are,
08:34 namely structured, semi-structured, and unstructured data. We also focused on data stores, both transactional
08:41 and analytical, where we saw OLTP and OLAP. We also focused
08:48 on transactional and analytical workloads, and then you learned about batch processing and stream processing
08:54 in data processing.