
DP-900 Azure Data Fundamentals: Relational Data, Non-Relational Data, and Analytics

DP-900 is Microsoft's foundational data certification. It covers core data concepts, Azure data services, and the difference between relational and non-relational data, without requiring hands-on configuration or SQL expertise. It is the entry point for the data engineering, analytics, and database administration paths on Azure, and a useful credential for anyone working with data in a business context who wants to understand what the data team actually does.


Core Data Concepts

DP-900 covers fundamental data literacy concepts.

Data formats: structured data (rows and columns, as in relational databases), semi-structured data (flexible schema, such as JSON, XML, and key-value pairs), and unstructured data (no predefined schema, such as images, videos, and documents).

Data workloads: OLTP (Online Transaction Processing) handles many concurrent small reads and writes and is optimised for low-latency transactions; typical services are Azure SQL Database and Cosmos DB. OLAP (Online Analytical Processing) runs large, complex queries over historical data and is optimised for aggregation; typical services are Azure Synapse Analytics and Azure Analysis Services.

Data roles: Database Administrator (manages and maintains databases: performance, availability, security), Data Engineer (builds data pipelines and storage infrastructure: ETL, data lake design), Data Analyst (queries and visualises data to support business decisions: Power BI, SQL queries), Data Scientist (builds predictive models: ML, statistics).

ETL (Extract, Transform, Load): data is moved from a source to a target and transformed along the way. Azure Data Factory orchestrates ETL pipelines.
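The three ETL stages can be sketched in a few lines of plain Python. This is only an illustration of the concept, not how Azure Data Factory is configured: the CSV content, the `orders` table, and the function names are hypothetical, and an in-memory SQLite database stands in for the target store.

```python
import csv
import io
import sqlite3

# Hypothetical source data: a small CSV export with inconsistent formatting.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,
3,Carol,7.25
"""

def extract(text):
    """Extract: read rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: tidy customer names and drop rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        name = row["customer"].strip().title()
        cleaned.append((int(row["order_id"]), name, float(amount)))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for the target database
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

The row with the missing amount is dropped during the transform step, so only two cleaned rows reach the target table.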

Azure Relational and Non-Relational Data Services

Relational databases: Azure SQL Database (managed SQL Server offered as PaaS, with elastic scale and built-in high availability), Azure SQL Managed Instance (near-100% SQL Server compatibility, suited to complex migrations), Azure Database for PostgreSQL and Azure Database for MySQL (managed open-source relational databases).

Relational concepts: normalisation (reduces redundancy through 1NF, 2NF, and 3NF), primary and foreign keys (enforce referential integrity), ACID transactions (Atomicity, Consistency, Isolation, Durability, which together guarantee data integrity).

Non-relational (NoSQL) databases: Azure Cosmos DB (globally distributed, with multiple APIs: NoSQL for documents, MongoDB, Cassandra, Gremlin, Table), Azure Cache for Redis (in-memory key-value), Azure Table Storage (simple NoSQL key-value). NoSQL trade-offs: flexible schema, horizontal scale, and high availability, often at the cost of weaker consistency guarantees (for example, eventual consistency in distributed scenarios).

Storage for files and big data: Azure Blob Storage holds unstructured data (images, videos, documents, backups). Data Lake Storage Gen2 adds a hierarchical namespace on top of Blob Storage for big data analytics workloads.
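Two of the relational concepts above, foreign keys and atomic transactions, can be seen in a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for a relational engine; the `customers` and `accounts` tables and the `transfer` helper are hypothetical, and Azure SQL Database would be accessed through its own drivers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE accounts (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        balance INTEGER NOT NULL CHECK (balance >= 0))"""
)
conn.execute("INSERT INTO customers VALUES (1, 'Alice')")
conn.execute("INSERT INTO accounts VALUES (1, 1, 100), (2, 1, 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, 1, 2, 30)       # valid transfer
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1, so it is rolled back
balances = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()

# Referential integrity: an account cannot point at a customer that does not exist.
try:
    conn.execute("INSERT INTO accounts VALUES (3, 99, 10)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
    conn.rollback()

print(ok, failed, balances, orphan_rejected)
```

In the failed transfer the credit to account 2 is applied first and then undone when the debit violates the balance check, which is the atomicity part of ACID: a partial transfer is never left behind.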

Analytics and Visualisation on Azure

Azure analytics services for DP-900: Azure Synapse Analytics (a unified analytics platform that brings SQL, Spark, pipelines, and Power BI integration into one workspace), Azure Databricks (Apache Spark-based analytics and ML, with collaborative notebooks and MLflow for ML lifecycle management), Power BI (business intelligence and data visualisation: datasets, reports, and dashboards, published to the Power BI Service).

Power BI components: Power BI Desktop (report authoring tool), Power BI Service (cloud publishing and sharing), Power BI Mobile (viewing on mobile devices). Report vs dashboard: a report is a multi-page interactive document; a dashboard is a single page of tiles pinned from reports that gives a high-level view.

Real-time analytics: Event Hubs ingests streaming data, Stream Analytics processes data in flight with SQL-like queries, and Power BI real-time streaming datasets display live data. Batch analytics: Azure Data Factory orchestrates data movement, Synapse Analytics queries the data, and Power BI reports the results.
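A common stream-processing pattern is aggregating events over fixed, non-overlapping time windows (often called tumbling windows). The sketch below shows the idea in plain Python over a small list; it is not the Stream Analytics query language, a real streaming engine would process events incrementally as they arrive, and the sensor readings and `tumbling_average` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical sensor readings: (seconds since start, device id, temperature)
events = [
    (1, "dev-a", 20.0),
    (4, "dev-a", 22.0),
    (9, "dev-b", 30.0),
    (12, "dev-a", 24.0),
    (18, "dev-b", 32.0),
    (19, "dev-b", 34.0),
]

def tumbling_average(events, window_seconds):
    """Average temperature per device in fixed, non-overlapping time windows."""
    totals = defaultdict(lambda: [0.0, 0])  # (window_start, device) -> [sum, count]
    for timestamp, device, temperature in events:
        # Every event belongs to exactly one window, identified by its start time.
        window_start = (timestamp // window_seconds) * window_seconds
        bucket = totals[(window_start, device)]
        bucket[0] += temperature
        bucket[1] += 1
    return {key: total / count for key, (total, count) in sorted(totals.items())}

averages = tumbling_average(events, window_seconds=10)
for (window_start, device), average in averages.items():
    print(window_start, device, average)
```

Each result row is the kind of summarised record that a downstream dashboard would display, rather than the raw event stream.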

Key exam facts — DP-900

  • OLTP: many small transactional reads/writes; OLAP: complex analytical queries over historical data
  • ETL: Extract (source), Transform (clean/reshape), Load (target) — Azure Data Factory orchestrates
  • ACID transactions: Atomicity, Consistency, Isolation, Durability — relational database guarantee
  • Azure Cosmos DB: globally distributed NoSQL with multiple API choices (NoSQL, MongoDB, Cassandra, Gremlin, Table)
  • Power BI: Desktop for authoring, Service for publishing, Mobile for viewing
  • Synapse Analytics: unified SQL + Spark + pipelines + Power BI in one platform
  • Data Lake Storage Gen2 = hierarchical namespace over Azure Blob for big data
  • DP-900 is conceptual — no coding or hands-on configuration required
  • Data Engineer builds pipelines; Data Analyst creates reports; Data Scientist builds models
  • Normalisation reduces redundancy; denormalisation improves read performance at the cost of duplication
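
The normalisation trade-off in the last point can be made concrete with a small sketch. It uses Python's built-in `sqlite3` module with hypothetical `customers`, `orders`, and `orders_flat` tables: the normalised design stores a customer's city once, while the denormalised copy repeats it on every order row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        total REAL);
    INSERT INTO customers VALUES (1, 'Alice', 'Leeds');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0);

    -- Denormalised copy: customer details repeated on every order row.
    CREATE TABLE orders_flat AS
    SELECT o.id AS order_id, c.name AS customer_name,
           c.city AS customer_city, o.total
    FROM orders AS o JOIN customers AS c ON c.id = o.customer_id;
    """
)

# Normalised: the city is stored once, so a change touches one row...
normalised_rows = conn.execute(
    "UPDATE customers SET city = 'York' WHERE id = 1"
).rowcount
# ...but reading it alongside orders needs a join.
joined = conn.execute(
    "SELECT o.id, c.city FROM orders AS o "
    "JOIN customers AS c ON c.id = o.customer_id ORDER BY o.id"
).fetchall()

# Denormalised: reads need no join, but the same change touches every copy.
flat_rows = conn.execute(
    "UPDATE orders_flat SET customer_city = 'York' WHERE customer_name = 'Alice'"
).rowcount
flat = conn.execute(
    "SELECT order_id, customer_city FROM orders_flat ORDER BY order_id"
).fetchall()

print(normalised_rows, flat_rows, joined, flat)
```

Both designs return the same answer; the difference is where the cost falls, on writes for the denormalised table and on reads for the normalised one.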

Common exam traps

NoSQL databases are always faster than relational databases

NoSQL databases trade certain consistency guarantees for horizontal scalability. For many transactional workloads, a well-designed relational database with proper indexing outperforms a NoSQL alternative.

Data warehouses and databases serve the same purpose

Operational (OLTP) databases are optimised for many concurrent transactions that each touch a few rows. Data warehouses (OLAP) are optimised for complex analytical queries across large historical datasets. The storage and indexing strategies differ fundamentally.
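
The difference shows up in the shape of the queries each workload runs. The sketch below uses Python's built-in `sqlite3` module and a hypothetical `sales` table purely to contrast the two query styles; SQLite is not a data warehouse, and a service such as Synapse Analytics uses different storage and indexing to make the analytical style fast at scale.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (region, year, amount) VALUES (?, ?, ?)",
    [
        ("North", 2022, 100.0),
        ("North", 2023, 150.0),
        ("South", 2022, 80.0),
        ("South", 2023, 120.0),
    ],
)
conn.commit()

# OLTP-style: a small write and a lookup by primary key, each touching one row.
with conn:
    new_id = conn.execute(
        "INSERT INTO sales (region, year, amount) VALUES ('North', 2023, 50.0)"
    ).lastrowid
single_row = conn.execute(
    "SELECT region, amount FROM sales WHERE id = ?", (new_id,)
).fetchone()

# OLAP-style: an aggregate that scans the whole history and groups it.
totals = conn.execute(
    "SELECT region, year, SUM(amount) FROM sales "
    "GROUP BY region, year ORDER BY region, year"
).fetchall()

print(single_row)
print(totals)
```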
