AI Jobs Disruption, BI Interview Reinvention, and GitHub's Privacy Betrayal
Published on 04.04.2026
30 BI Engineering Interview Questions That Actually Matter in the AI Era
TLDR: The BI engineering interview process hasn't kept pace with how the job actually works in 2026. Anusha Kovi, a data and BI engineer focused on governed AI for data platforms, shares the 30 questions she'd ask candidates today instead of the ones that were being asked when she interviewed in 2022.
Summary: I find this piece refreshing because someone finally said the quiet part out loud: half the questions being asked in BI interviews four years ago are now basically useless. Anusha's framing is simple. The job has changed, and the interview hasn't. In 2026, a BI engineer isn't just writing SQL and shipping dashboards. They're building governed data products, reasoning about trust in natural language interfaces, and figuring out how AI-assisted querying changes what the underlying data model even needs to look like.
The sample question that leads the piece is telling: write a query to find the second-highest revenue day per product category in the last 90 days. That's not a trick question, it's a real one. But the interesting thing isn't whether the candidate gets the window function right. It's whether they reason about what "last 90 days" means at query time versus data freshness time, whether they consider edge cases in sparse data, whether they'd write this differently if it were going to power an AI-generated answer rather than a human-read report.
What I appreciate most is that Kovi is writing from lived experience, not from a listicle generator. She's building NL analytics pipelines. She knows that when your business users are asking questions in natural language and expecting accurate answers, the tolerance for a slightly wrong metric definition goes from "annoying" to "catastrophic." The interview questions she's designing are meant to surface people who understand that distinction.
Twenty-seven minutes of reading is a commitment, but if you hire BI engineers or are preparing to interview for these roles, this one is worth the time. The industry keeps hiring for skills that peaked in relevance around 2020. That's not good for anyone.
Key takeaways:
- BI interviews still test 2020-era skills; the job has fundamentally shifted toward governed AI pipelines
- Natural language analytics raises the stakes on data quality and metric consistency
- Good interview questions in 2026 probe reasoning about freshness, trust, and AI-readiness of data models
- The gap between what interviewers ask and what the job requires is a hiring risk, not just a candidate inconvenience
Why do I care: From an architecture perspective, the shift Kovi is describing is real and has teeth. When your data layer starts feeding AI-generated answers directly to end users, every ambiguous column name, every inconsistent grain, every undocumented business rule becomes a liability. I've seen teams build beautiful BI tooling that completely fell apart when connected to an LLM interface because nobody had ever pressure-tested the semantic layer. Hiring people who understand that connection early is about ten times cheaper than retrofitting it later.
30 BI Engineering Interview Questions That Actually Matter in the AI Era
Microsoft Generative AI Report: The 40 Most Disrupted Jobs and The 40 Most Secure Jobs
TLDR: Based on a Microsoft Research study analyzing 200,000 real-world interactions, there are clear patterns in which professions generative AI is actively disrupting and which ones remain well-insulated. The breakdown is more nuanced than the usual "AI replaces everything" narrative.
Summary: Reports like this one land differently depending on where you sit. If you're a developer, you might assume you're safe. If you're a content writer, you might assume you're not. Microsoft's data from 200,000 actual interactions tells a more interesting story than either assumption.
The study looks at real usage patterns, not theoretical capability. That matters because there's a significant difference between what AI can do in a lab and what it actually replaces in a working environment. Jobs that appear most disrupted tend to share a set of characteristics: they involve producing artifacts that can be evaluated quickly, they follow patterns that can be learned from large datasets, and they don't require sustained accountability over time. Jobs on the secure side tend to involve physical presence, high-stakes judgment in novel situations, or social trust that takes years to build.
What strikes me reading this is that the disruption isn't evenly distributed within industries. Not all developers are in the same position. A developer who writes boilerplate CRUD apps all day is in a different situation than one who spends their time reasoning about distributed systems failure modes or untangling decade-old codebases. The same pattern holds across almost every field. The part of the job that's standardized is at risk. The part that requires genuine judgment in ambiguous situations is not, at least not yet.
The 21-minute read is worth it for the actual list, which Microsoft drew from empirical data rather than pundit opinion. Whether or not you agree with the methodology, seeing your own role mapped against these criteria is a useful exercise.
Key takeaways:
- Study based on 200,000 real-world interactions, not hypothetical capability assessments
- Most disrupted jobs involve repeatable artifact production and pattern-based evaluation
- Most secure jobs involve physical presence, novel judgment, or long-term social trust
- Within any profession, the standardized portion of the role carries the most disruption risk
Why do I care: I care because I work with teams who are hiring right now, and the temptation to stop hiring for certain roles because "AI can do that" is real and often premature. The nuance this report surfaces is that AI disrupts tasks inside jobs more than it eliminates jobs wholesale. As an architect, when I'm thinking about team composition over the next three to five years, it's the task-level disruption I need to plan around. That means being very deliberate about which skills I'm investing in growing internally versus accepting that tooling will handle them.
Microsoft Generative AI Report: The 40 Most Disrupted Jobs and The 40 Most Secure Jobs
GitHub Wants Your Private Code to Train AI. What's Your Move?
TLDR: GitHub announced that starting April 24, it will use Copilot interaction data, including code from private repositories, to train AI models by default. The same week, Copilot was caught injecting ads into 1.5 million or more pull requests. This isn't a hypothetical trust erosion. It's already happening.
Summary: Let me be direct about this one: what GitHub is doing here is a textbook opt-out-by-default data grab dressed up in product language. "Helping us improve Copilot" sounds reasonable until you realize that improving Copilot means training on your proprietary business logic, your internal APIs, your unreleased features, and your client data, without your explicit consent.
The ad injection incident in pull requests is what really bothers me, not just as a privacy issue but as a platform integrity issue. Pull requests are a professional communication channel. When GitHub's tooling starts inserting sponsored content into that workflow, they've crossed a line that has nothing to do with AI training and everything to do with how the company views its relationship with its users. These two incidents happening in the same week is not a coincidence. It's a pattern.
The developer community's reaction has been fractured, which is actually the worst possible outcome for GitHub. Some people are already migrating to Codeberg. Many more are opting out and staying, which means they're still on the platform but now distrust it. A smaller group doesn't care. None of these responses rebuild what GitHub has spent fifteen years constructing: the sense that it's a neutral, trustworthy infrastructure layer for software development. That perception, once gone, is very hard to get back.
The practical question for anyone running a development team right now is: have you gone through your GitHub organization settings and explicitly disabled this? The opt-out clock is running. April 24 is not far away.
Key takeaways:
- GitHub will use private repo code for AI training by default starting April 24, 2026; opt out is available but not automatic
- Copilot was caught injecting ads into pull requests, affecting over 1.5 million PRs
- Both incidents together signal a shift in how GitHub monetizes developer trust
- Self-hosting alternatives like Codeberg are seeing renewed interest as a result
- If you manage a GitHub organization, audit your privacy settings before April 24
Why do I care: I run code in private repos. My clients run code in private repos. The idea that GitHub's default assumption is "you consent to training" rather than "you do not consent to training" is backwards from how data privacy should work. I'm not opposed to opt-in programs where teams voluntarily contribute to improving tools they depend on. That's a reasonable value exchange. Opt-out-by-default for commercial AI training on proprietary code is not that. It's extraction dressed up as improvement, and the framing matters.
Poll: GitHub Wants Your Private Code to Train AI. What's Your Move?