SQL Window Functions for Data Analyst Interviews: A Practical Guide

CrackJobs Team · 12/3/2026 · 5 min read

Why window functions are interview-critical

Window functions separate junior data analyst candidates from mid-to-senior ones. Almost every company that interviews data analysts at the mid-to-senior level will test window functions — not as a trick question, but because window functions reflect how real analytics work is done. Understanding when and how to use them signals that you've written production analytics code, not just LeetCode solutions.

The good news: there are four patterns that cover roughly 90% of window function interview questions. Master these four, and you'll handle almost anything an interviewer throws at you.

The syntax to know before anything else

SELECT
  column,
  FUNCTION() OVER (
    PARTITION BY partition_column
    ORDER BY order_column
    ROWS/RANGE BETWEEN frame_start AND frame_end
  ) AS alias
FROM table;

The OVER clause is what makes a function a window function. PARTITION BY divides the data into groups (like GROUP BY, but without collapsing rows). ORDER BY determines the sequence within each partition. The optional frame clause (ROWS or RANGE BETWEEN ...) defines how large the "window" of rows around the current row is.
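The "like GROUP BY, but without collapsing rows" distinction is worth seeing concretely. Here's a small runnable sketch using Python's built-in sqlite3 module (SQLite 3.25+ supports window functions); the sales table and its values are made up for illustration:

```python
import sqlite3

# Hypothetical sales table to contrast GROUP BY with PARTITION BY.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (category TEXT, amount INTEGER)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [("a", 10), ("a", 20), ("b", 5)])

# GROUP BY collapses each category to a single summary row.
grouped = con.execute(
    "SELECT category, SUM(amount) FROM sales GROUP BY category ORDER BY category"
).fetchall()
print(grouped)  # [('a', 30), ('b', 5)]

# SUM() OVER (PARTITION BY ...) attaches the category total to every
# detail row without collapsing anything.
windowed = con.execute(
    "SELECT category, amount, SUM(amount) OVER (PARTITION BY category) "
    "FROM sales ORDER BY category, amount"
).fetchall()
print(windowed)  # [('a', 10, 30), ('a', 20, 30), ('b', 5, 5)]
```

Same aggregate, two shapes of output: GROUP BY returns one row per group, while the window version keeps all three input rows and repeats the group total alongside each one.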

Pattern 1: Ranking

Use case: "Find the top 3 products by revenue in each category."

SELECT *
FROM (
  SELECT
    category,
    product_name,
    revenue,
    ROW_NUMBER() OVER (
      PARTITION BY category
      ORDER BY revenue DESC
    ) AS revenue_rank
  FROM products
) ranked
WHERE revenue_rank <= 3;

The subquery is necessary because window functions can't be referenced in WHERE; the filter has to happen one level up. (Aliasing the column as revenue_rank rather than rank also sidesteps RANK being a reserved word in several dialects.)

When interviewers ask about this, they're also watching:

  • Do you know the difference between ROW_NUMBER(), RANK(), and DENSE_RANK()?
  • ROW_NUMBER() — unique sequential number, no ties
  • RANK() — ties get the same rank, next rank skips (1, 1, 3)
  • DENSE_RANK() — ties get the same rank, no gap (1, 1, 2)

The correct choice depends on how you want to handle ties, so ask the interviewer: "Should tied values share a rank?" If tied rows should share a rank and all of them should be returned, use RANK() or DENSE_RANK() (DENSE_RANK() if the ranks must stay consecutive). If you need exactly N rows, use ROW_NUMBER(), ideally with a tiebreaker column in the ORDER BY so the result is deterministic.
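The (1, 1, 3) vs (1, 1, 2) behavior is easy to verify on toy data. A minimal sketch using Python's sqlite3 module, with a made-up scores table containing a tie at the top (the ROW_NUMBER window adds name as a tiebreaker so its output is deterministic):

```python
import sqlite3

# Hypothetical scores table with a tie at 100, to compare the three
# ranking functions side by side.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
con.executemany("INSERT INTO scores VALUES (?, ?)",
                [("a", 100), ("b", 100), ("c", 90)])

rows = con.execute("""
    SELECT name,
           -- tiebreaker on name makes ROW_NUMBER deterministic
           ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,
           RANK()       OVER (ORDER BY score DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk
    FROM scores
    ORDER BY score DESC, name
""").fetchall()
for r in rows:
    print(r)
# ('a', 1, 1, 1)
# ('b', 2, 1, 1)
# ('c', 3, 3, 2)   <- RANK skips to 3, DENSE_RANK stays at 2
```

Row c is where the three functions diverge: ROW_NUMBER gives it 3 because it is the third physical row, RANK gives 3 because two rows tied ahead of it, and DENSE_RANK gives 2 because only one distinct score beat it.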

Pattern 2: Running totals and moving averages

Use case: "Calculate a 7-day rolling average of daily active users."

SELECT
  date,
  dau,
  AVG(dau) OVER (
    ORDER BY date
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
  ) AS dau_7day_avg
FROM daily_active_users;

The frame clause ROWS BETWEEN 6 PRECEDING AND CURRENT ROW is what creates the 7-day window (the current row plus the 6 preceding rows). For a simple running total:

SUM(revenue) OVER (
  ORDER BY date
  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS cumulative_revenue

Interviewers often follow up with: "What's the difference between ROWS and RANGE?" ROWS refers to physical rows; RANGE refers to logical values: if two rows share the same ORDER BY value, RANGE includes all of those peer rows in the frame, while ROWS does not. For most analytics use cases, ROWS is what you want.
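The peer-row behavior shows up clearly with a duplicated ORDER BY value. A runnable sketch via Python's sqlite3 module, with a made-up revenue table that has two rows on day 2:

```python
import sqlite3

# Hypothetical daily revenue with two rows on day 2, to contrast
# ROWS and RANGE framing in a running sum.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE revenue (day INTEGER, amount INTEGER)")
con.executemany("INSERT INTO revenue VALUES (?, ?)",
                [(1, 10), (2, 20), (2, 30), (3, 40)])

rows = con.execute("""
    SELECT day, amount,
           SUM(amount) OVER (ORDER BY day
               ROWS  BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_sum,
           SUM(amount) OVER (ORDER BY day
               RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_sum
    FROM revenue
    ORDER BY day, amount
""").fetchall()
for r in rows:
    print(r)

# RANGE treats the two day-2 rows as peers, so both see the full
# day-2 total (60). ROWS counts physical rows, so the two day-2 rows
# get different running sums, and which one is "first" is arbitrary
# because the ORDER BY ties on day.
range_sums = [r[3] for r in rows]
```

The nondeterminism within tied rows under ROWS is exactly why the interviewer question matters: a running total over a non-unique ORDER BY column can return different results run to run unless you add a tiebreaker or use RANGE deliberately.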

Pattern 3: LAG and LEAD for period-over-period comparisons

Use case: "Calculate week-over-week revenue change for each product."

SELECT
  week,
  product_id,
  revenue,
  LAG(revenue, 1) OVER (
    PARTITION BY product_id
    ORDER BY week
  ) AS prev_week_revenue,
  revenue - LAG(revenue, 1) OVER (
    PARTITION BY product_id
    ORDER BY week
  ) AS wow_change
FROM weekly_revenue;

LAG looks backward to the previous row; LEAD looks forward to the next. Both take an optional offset (default 1) and an optional default value to return when there's no previous/next row.

Common interview extension: "How would you calculate the percentage change?" Add a CASE statement to handle division by zero:

CASE
  WHEN LAG(revenue) OVER (...) = 0 THEN NULL
  ELSE (revenue - LAG(revenue) OVER (...)) / LAG(revenue) OVER (...) * 100
END AS wow_pct_change
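Here's the division-by-zero guard in action as a runnable sketch via Python's sqlite3 module, with made-up weekly revenue that includes a zero week:

```python
import sqlite3

# Hypothetical weekly revenue including a zero week, to exercise
# the divide-by-zero guard in the percentage-change CASE.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE weekly_revenue (week INTEGER, revenue REAL)")
con.executemany("INSERT INTO weekly_revenue VALUES (?, ?)",
                [(1, 100.0), (2, 0.0), (3, 50.0)])

rows = con.execute("""
    SELECT week, revenue,
           CASE
             WHEN LAG(revenue) OVER (ORDER BY week) = 0 THEN NULL
             ELSE (revenue - LAG(revenue) OVER (ORDER BY week))
                  / LAG(revenue) OVER (ORDER BY week) * 100
           END AS wow_pct_change
    FROM weekly_revenue
    ORDER BY week
""").fetchall()
for r in rows:
    print(r)
# (1, 100.0, None)    <- no previous week, LAG is NULL
# (2, 0.0, -100.0)
# (3, 50.0, None)     <- previous week was 0, guarded to NULL
```

An equivalent and slightly terser guard many candidates reach for is wrapping the denominator in `NULLIF(LAG(revenue) OVER (...), 0)`, since dividing by NULL yields NULL.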

Pattern 4: Cohort retention analysis

Retention analysis is the most complex window function pattern — and the most impressive to demonstrate in an interview.

Use case: "Calculate 30-day retention for each weekly signup cohort."

WITH first_activity AS (
  SELECT
    user_id,
    MIN(activity_date) AS first_activity_date,
    DATE_TRUNC('week', MIN(activity_date)) AS cohort_week
  FROM user_activity
  GROUP BY user_id
),
cohort_activity AS (
  SELECT
    f.user_id,
    f.cohort_week,
    a.activity_date - f.first_activity_date AS days_since_signup
  FROM first_activity f
  JOIN user_activity a ON f.user_id = a.user_id
)
SELECT
  cohort_week,
  COUNT(DISTINCT user_id) AS cohort_size,
  COUNT(DISTINCT CASE WHEN days_since_signup BETWEEN 28 AND 35 THEN user_id END) AS retained_30d,
  COUNT(DISTINCT CASE WHEN days_since_signup BETWEEN 28 AND 35 THEN user_id END) * 1.0 /
    COUNT(DISTINCT user_id) AS retention_rate_30d
FROM cohort_activity
GROUP BY cohort_week
ORDER BY cohort_week;

Days are measured from each user's first activity date rather than the start of the cohort week, so mid-week signups don't distort the 28-35 day band, and cohort_size counts every user in the cohort so the retention denominator is complete. (The date arithmetic is Postgres-style: subtracting two DATE values yields an integer day count; MySQL would use DATEDIFF instead.)

This pattern tests CTEs, date manipulation, conditional aggregation, and the ability to think through a business problem (what does "30-day retention" actually mean?) rather than just write syntax.
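A stripped-down version of the same cohort logic can be run end to end with Python's sqlite3 module. Everything here is illustrative: the user_activity rows are made up, cohorts are keyed by each user's first activity date (daily rather than weekly, to keep the toy data small), and julianday() differences stand in for date subtraction since SQLite has no DATEDIFF:

```python
import sqlite3

# Hypothetical activity log: user 1 returns 29 days after first activity
# (inside the 28-35 day band); user 2 never comes back.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE user_activity (user_id INTEGER, activity_date TEXT)")
con.executemany("INSERT INTO user_activity VALUES (?, ?)", [
    (1, "2024-01-01"), (1, "2024-01-30"),
    (2, "2024-01-01"),
])

rows = con.execute("""
    WITH first_activity AS (
      SELECT user_id, MIN(activity_date) AS first_date
      FROM user_activity
      GROUP BY user_id
    ),
    cohort_activity AS (
      SELECT f.user_id, f.first_date AS cohort_date,
             -- julianday() difference = whole days between ISO date strings
             CAST(julianday(a.activity_date) - julianday(f.first_date)
                  AS INTEGER) AS days_since_signup
      FROM first_activity f
      JOIN user_activity a ON f.user_id = a.user_id
    )
    SELECT cohort_date,
           COUNT(DISTINCT user_id) AS cohort_size,
           COUNT(DISTINCT CASE WHEN days_since_signup BETWEEN 28 AND 35
                               THEN user_id END) AS retained_30d
    FROM cohort_activity
    GROUP BY cohort_date
""").fetchall()
print(rows)  # [('2024-01-01', 2, 1)]
```

Walking through a tiny fixture like this out loud (user 1's second visit is 29 days after the first, so they land in the band; user 2 does not) is exactly the kind of validation step interviewers want to see.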

How to approach window function questions in interviews

  1. Clarify the data model first. Ask: "Is there one row per user per day, or could there be multiple events per user per day?" The answer changes whether you need to deduplicate before windowing.
  2. Write the PARTITION BY before the function. Identifying how to partition the data is the hardest conceptual step. Get it right first.
  3. Validate your logic out loud. Walk the interviewer through what a small example would produce. "If user A has activity on days 1, 3, and 7, my LAG would give NULL on day 1, day 1 on day 3, and day 3 on day 7..."
  4. Check for edge cases. NULL values in the ORDER BY column, users with only one event (LAG returns NULL), date gaps in rolling average windows.

Practising window functions for real interviews

The difference between knowing window function syntax and being fluent with it under interview pressure is significant. In a data analyst mock interview, you'll be asked window function questions with follow-ups — "what if there are ties?", "what if a user has no activity in a given week?", "how would this query scale on a 10TB table?" — that require you to reason about the SQL, not just recall it.

The best preparation is writing these queries from scratch, out loud, multiple times, with someone watching for the moments where your logic breaks down.
