Data Preparation and Power Query
Power BI data preparation happens in Power Query (also called the M language engine). Import mode: data copied into Power BI's in-memory engine (VertiPaq columnar database) — fast queries, data refreshes required to update. DirectQuery mode: Power BI sends queries directly to the source database at query time — always current, but slower queries and limited DAX support. Composite models: mix Import and DirectQuery in one report — cached tables for frequently used dimension data, DirectQuery for large fact tables. Power Query transformations: merge queries (SQL JOIN equivalent — left outer, inner, full outer), append queries (UNION — stack tables with same columns), pivot/unpivot columns, split column by delimiter, replace values, filter rows, add custom columns using M expressions. Data types and quality: change column types (Text, Whole Number, Decimal, Date, Boolean), remove duplicates, fill down (fill null values from the value above — for merged cells), trim and clean (remove whitespace and non-printable characters). Query folding: Power Query pushes transformations back to the source database as SQL — only works for relational sources in DirectQuery/Import; transformations that cannot fold run in Power Query engine — fold as much as possible for performance.
Data Modelling and DAX
Data modelling in Power BI: create relationships between tables (defined by matching column — Power BI auto-detects). Relationship properties: cardinality (one-to-one, one-to-many, many-to-many), cross-filter direction (single — default, filters flow from one to many; both — bidirectional, for complex models, use cautiously). Star schema: fact table (transactional data — high row count) surrounded by dimension tables (descriptive attributes — date, customer, product) — optimal for Power BI performance and DAX simplicity. DAX (Data Analysis Expressions): formula language for calculated columns, measures, and calculated tables. Measures vs calculated columns: measures evaluate dynamically in filter context (recalculated for every visual), calculated columns are computed row by row at refresh time and stored in the model. Key DAX functions: CALCULATE (modifies filter context — most important DAX function), SUM/SUMX (X suffix = iterator), FILTER (returns filtered table), ALL/ALLEXCEPT/ALLSELECTED (remove filter context), RELATED (look up value from related table), IF/SWITCH, DATEADD/DATESYTD (time intelligence). CALCULATE example: Sales LY = CALCULATE([Total Sales], DATEADD(Dates[Date], -1, YEAR)).
Report Design and Power BI Service
Report design: choose the right visual for the data type (bar chart for comparison, line chart for trends, scatter plot for correlation, matrix for tabular cross-tab, card for KPIs, map for geographic data). Slicers: interactive filters for report users — sync slicers across pages for consistent filtering. Drill-through: right-click to navigate to a detail page filtered by the selected data point — configure on the target page's drill-through fields. Row-level security (RLS): restrict data visible to specific users — create roles in Power BI Desktop (DAX filter expressions per table), assign users to roles in Power BI Service. Static RLS (explicit list of users per role) vs Dynamic RLS (username() function compared to a user table — scales without modifying roles). Power BI Service: publish reports from Desktop, create dashboards by pinning visuals from reports, schedule data refresh (up to 8 times per day on Pro, 48 times per day on Premium), share via workspace (colleagues who are members see all content), per-item sharing, or publish to public web. Power BI Premium Per Capacity: dedicated capacity for large organisations — paginated reports (SSRS-style pixel-perfect printing), AI visuals, deployment pipelines (DEV > TEST > PROD promotion), dataflows (reusable Power Query transformations in the cloud).