Hiring as a Stress Test for Knowledge-First AI

The main problem with AI in hiring isn’t that it makes mistakes. Everyone makes mistakes; that’s hardly news. The problem is different: it starts pretending it already understands everything far too early.

At the input stage it typically has a rather pathetic set of data. A job description, a resume, sometimes a cover letter, sometimes notes from an interview, sometimes nothing at all except a couple of paragraphs of corporate poetry about dynamic environments and fast-paced culture. Yet even at this stage the system cheerfully starts talking about match scores, fit, suitability, and other machine astrology.

Looks solid. Numbers, percentages, a confident tone. You almost want to believe that something resembling understanding actually happened inside.

Then you look closer and discover there’s no real understanding there at all. There are a few extracted fragments, a few assumptions, a couple of matched keywords, a pile of gaps, and a very strong desire to pass all of that off as knowledge.

From there, the usual circus gets built around this rickety construction. Employers complain about identical AI-generated resumes. Candidates complain about opaque filters and automated rejections. Both sides are nervous, both sides are optimizing something, both sides are trying to guess what exactly is happening inside the black box. As a result the process looks less like working with knowledge and more like a fairly pathetic ritual involving digital tambourines.

The system says: match 95%.

Wonderful. Now we just need to figure out what exactly it means by that.

Which facts were actually extracted. What’s been confirmed. What is simply a statement in the text. Where data is missing. Where there’s ambiguity. Where it ignored the gaps. And where it just neatly plastered everything over with a smooth formulation so the conclusion would look confident and technological.

That’s precisely why the hiring use case seems so useful to me.

Not because the world desperately needs yet another AI widget for HR. There’s already a glut of that stuff; it breeds like pigeons at a train station. But because hiring very clearly illustrates a more general problem: what happens when a system starts performing knowledge before it has actually acquired any.

This is, in fact, exactly what I’m currently picking apart within Cogentis AI.

For me this isn’t a hiring tool in any narrow sense. Hiring here is simply a convenient and fairly brutal test bench. It makes it especially visible how quickly the standard AI approach collapses into an imitation of understanding. The system has a text and a half, two hints, and self-confidence the size of a data center, and it’s already ready to rank people, sort them, issue recommendations, and apparently feel like the pinnacle of engineering thought while doing so.

I’m interested in a different story.

Not “evaluate the candidate.”

Not “calculate fit.”

Not “produce a score.”

But to gather, connect, and progressively refine knowledge.

In reality we have several scattered sources: a role description, requirements, a resume, interviews, scorecards, letters, additional clarifications, sometimes notes from the hiring manager that exist in the genre of “something feels vaguely off but I can’t explain it.” Some of this information is formalized. Some exists only as text. Some contradicts other parts. Some is simply absent.

A proper system in this situation shouldn’t be issuing verdicts — it should be building a map.

What is actually known about the candidate. What the position requires. Where there’s confirmed overlap. Where there’s a gap. Where there’s a contradiction. Where the system simply doesn’t have sufficient grounds to assert anything at all. And which questions actually make sense if the goal is to reduce uncertainty — not just to look smart as quickly as possible.
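To make that less abstract, here is a minimal sketch of what such a map could look like as a data structure. Everything in it is a hypothetical illustration of the idea, not an actual Cogentis AI schema; the names and statuses are assumptions made up for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class EpistemicStatus(Enum):
    """How well-grounded a single piece of knowledge is."""
    CONFIRMED = "confirmed"        # corroborated by more than one source
    CLAIMED = "claimed"            # stated once, nothing backing it yet
    CONTRADICTED = "contradicted"  # sources disagree with each other
    UNKNOWN = "unknown"            # the role needs it, no data at all


@dataclass
class KnowledgeItem:
    """One assertion about a candidate or a role, plus its evidence trail."""
    statement: str                 # e.g. "five years of backend Python"
    status: EpistemicStatus
    sources: list[str] = field(default_factory=list)          # where it was said
    contradicted_by: list[str] = field(default_factory=list)  # sources disputing it


@dataclass
class KnowledgeMap:
    """The evolving picture: items plus the questions that would shrink the unknown."""
    items: list[KnowledgeItem] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)

    def gaps(self) -> list[KnowledgeItem]:
        """Everything the map cannot currently assert anything about."""
        return [i for i in self.items if i.status is EpistemicStatus.UNKNOWN]
```

The point of the structure isn’t the particular fields. The point is that uncertainty is first-class: a gap is an object you can list and act on, not an absence you paper over.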

That’s what strikes me as the substantive task.

In this kind of scenario Cogentis AI doesn’t replace the recruiter and doesn’t try to play oracle. It does more boring but far more useful work. It extracts knowledge from different sources, connects the pieces to each other, marks confidence levels, surfaces gaps, and assembles an evolving picture.

The output isn’t “suitable / not suitable,” but something considerably more honest.

What’s confirmed. Confirmed by what, exactly. What remains only a claim so far. Where there are contradictions. What data is missing. Which follow-up questions actually make sense. And why those specifically, rather than a random selection from the corporate grab-bag.

In other words this isn’t the “throw in a resume and pray” story. Not for the candidate, not for the employer.

This is work with incomplete knowledge.

And the first version of this use case already produces a clearly observable effect. Not the abstract “something interesting,” the way presentations tend to phrase things when there’s actually not much to show. But a concrete, tangible difference in how the task is represented.

Instead of one smooth and opaque score, you can lay out the picture in layers. What the system was able to extract from the job description. What it saw in the experience description. What looks like a genuine match. What remains an unconfirmed claim. Where information is absent. Where formulations are too vague. Where there’s a conflict between sources. And what needs to be clarified further, if the goal is to understand better — not to cut something off as early as possible.

This, strangely enough, fundamentally changes the logic of the process itself.

Because the task here is not at all about issuing a verdict on a person as quickly as possible. The task is to reduce the area of the unknown as data comes in.

This seems like a fairly obvious thought. But if you look at the AI hiring market, you get the impression that this particular step is somehow considered optional.

Things there are usually arranged more simply. There’s a job description. There’s a resume. Some kind of magic runs between them. Out comes a confident tone and a barely explainable result. If you’re very lucky, a couple of pseudo-arguments get added so the rejection or selection doesn’t look entirely like shamanism.

But the problem is that not all claims are equally reliable.

If a person wrote something in their resume, that’s one level. If the same thing came up in the interview, that’s already another. If it’s confirmed across multiple sources, even better. If there’s almost nothing on a point besides a nice-sounding phrase, then that’s not knowledge — it’s a hole, neatly covered over with text.

So what’s more useful at the output isn’t a flat list of “matches / doesn’t match,” but a more honest map: where we have confirmed knowledge, where we have a claim, where there’s a contradiction, and where there’s simply a void.
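As a sketch of how that ordering could be made mechanical, building on the hypothetical types above: the status of an item follows from its corroboration, never the other way around. The thresholds here are illustrative assumptions, not a calibrated rule.

```python
def derive_status(item: KnowledgeItem) -> EpistemicStatus:
    """Illustrative rule: epistemic status is derived from evidence, not assigned."""
    if item.contradicted_by:           # any source disputes it: surface the conflict
        return EpistemicStatus.CONTRADICTED
    if len(item.sources) >= 2:         # e.g. resume plus interview transcript
        return EpistemicStatus.CONFIRMED
    if len(item.sources) == 1:         # a single unbacked statement
        return EpistemicStatus.CLAIMED
    return EpistemicStatus.UNKNOWN     # a hole, not knowledge
```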

After an interview you can add the transcript, and the picture updates. You can add scorecards from the interviewers, and some things become clearer. You can add correspondence, if important clarifications surface in it. You can update the knowledge about the position itself, if requirements suddenly changed midway through the process. Which, of course, never happens to anyone, ever. People always know perfectly well who they’re looking for, from day one to the very end. Naturally.
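In code terms that update is just an ingest step: each new source adds or corroborates items, and statuses are re-derived rather than overwritten. Again a hypothetical sketch on top of the types above; real matching of statements would need far more than string equality.

```python
def ingest_source(kmap: KnowledgeMap, source_id: str,
                  statements: list[str]) -> None:
    """Fold one new source (transcript, scorecard, email) into the map."""
    for statement in statements:
        item = next((i for i in kmap.items if i.statement == statement), None)
        if item is None:
            item = KnowledgeItem(statement=statement,
                                 status=EpistemicStatus.CLAIMED)
            kmap.items.append(item)
        item.sources.append(source_id)     # one more piece of evidence
        item.status = derive_status(item)  # re-derive; the picture updates
```

Adding an interview transcript is then just another `ingest_source` call, and anything the transcript corroborates moves from claimed to confirmed on its own.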

There’s another important effect as well.

In a system like this, a candidate doesn’t need to cram themselves into a castrated resume specifically optimized for a five-second scan by a recruiter or an ATS. Right now a person is constantly forced to play a strange game: write briefly enough, smoothly enough, resembling the job posting enough, safely enough, keyword-friendly enough. And at this very stage, part of the real substance is already lost.

Meaning the process hasn’t even started yet, and useful knowledge has already been sacrificed on the altar of machine-scanning convenience.

The more sensible approach would be to let a person talk about their experience in a more human way. What they did. What they were responsible for. What decisions they made. What constraints they worked within. What exactly they built, changed, fixed, broke, and then heroically fixed again.

And then the system’s job isn’t to admire the smoothness of the formulations, but to extract formalizable facts from that account. Which technologies and tasks are actually mentioned. What level of responsibility is visible. What can already be matched against the role. What still looks like just a claim. What needs to be verified. What can be confirmed through other sources. What makes sense to ask about in the interview.

Then instead of a primitive keyword game something at least somewhat substantive emerges. The person talks about their experience normally. The system turns that into a more structured and verifiable map. Information loss at the input decreases rather than increases — which is the opposite of what typically happens in the current industrial magic known as efficient hiring.
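To show the shape of what such extraction could produce (all field names here are invented for this sketch): a free-form account of experience gets reduced to facts that each point back at the passage they came from.

```python
# Hypothetical output of an extraction pass over a free-form story.
# Every fact keeps a reference to its source passage, so nothing rests
# on the smoothness of the wording alone.
extracted_facts = [
    {
        "kind": "technology",
        "value": "PostgreSQL",
        "evidence": "candidate_story.txt, paragraph 3",
        "needs_verification": False,   # described with concrete specifics
    },
    {
        "kind": "responsibility",
        "value": "owned incident response for the payments service",
        "evidence": "candidate_story.txt, paragraph 5",
        "needs_verification": True,    # still a claim; worth an interview question
    },
]
```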

The result isn’t a magical “candidate assessment” but a living knowledge map, refined as new data arrives.

And that’s precisely why this approach can be useful to both sides.

For the employer it provides not a vague impression and not a mysterious score, but an observable picture: what’s confirmed, what’s missing, where the informational weak points are, which questions still make sense. Not “received a CV, ran it through a filter, and waiting for enlightenment,” but a more consistent effort to reduce incompleteness.

For the candidate it can also offer something beyond an impersonal “you weren’t a fit for us.” Not necessarily a promise of fairness — people remain people, and hiring remains hiring. But at least a more legible picture: what the role requires, where strong overlap is already visible, where the gaps are, what would be worth clarifying, and where the system simply doesn’t know enough to draw a conclusion.

So this isn’t about another automated filter. It’s about a more honest and more observable process, where incompleteness of knowledge isn’t plastered over with confident text but becomes the subject of explicit work.

That’s why this use case interests me. Not as a separate market and not as an end goal, but as a very clear demonstration of why a knowledge-first system is needed in the first place.

In this kind of architecture the LLM isn’t a source of truth and doesn’t make the decision. It can help extract structure, compile a report, formulate questions, bring data into a usable form. But the conclusion itself must be grounded not in the smoothness of the text, but in explicit facts, connections, gaps, confidence levels, and references to sources.
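A sketch of that division of labor, with hypothetical function names: the LLM may propose structured facts, but nothing enters the map without evidence, and the final summary is computed from the map’s explicit structure rather than taken from the model’s prose.

```python
def admit_fact(kmap: KnowledgeMap, proposed: dict) -> bool:
    """Gatekeeper: an LLM-proposed fact without a source is not knowledge."""
    if not proposed.get("evidence"):
        kmap.open_questions.append(
            f"Unsupported claim, needs a source: {proposed['value']}"
        )
        return False
    item = KnowledgeItem(statement=proposed["value"],
                         status=EpistemicStatus.CLAIMED,
                         sources=[proposed["evidence"]])
    item.status = derive_status(item)
    kmap.items.append(item)
    return True


def summarize(kmap: KnowledgeMap) -> dict:
    """The conclusion is read off the map's structure, never off LLM text."""
    by_status: dict[EpistemicStatus, list[str]] = {s: [] for s in EpistemicStatus}
    for item in kmap.items:
        by_status[item.status].append(item.statement)
    return {
        "confirmed": by_status[EpistemicStatus.CONFIRMED],
        "claims": by_status[EpistemicStatus.CLAIMED],
        "contradictions": by_status[EpistemicStatus.CONTRADICTED],
        "missing": by_status[EpistemicStatus.UNKNOWN],
        "questions": kmap.open_questions,
    }
```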

And the further I work on this, the less interest I have in systems that start performing knowledge too early.

Far more interesting are systems that can honestly show the boundaries of what they actually know.

Given how AI in hiring is currently arranged, even that alone sounds almost like a radical idea.