Traditional matchmakers did far more than just present their clients with a ‘feed’ of profile cards. They got to know them holistically, as members of their community, through a mix of high-pressure and casual interaction. Most importantly, they provided advice based on their judgment, integrating selection with growth so that the process did not devolve into a mere examination ritual.

The growth of dating apps over the past decade gave rise to a ‘dimensional collapse.’ An arms race replaced quality, in-depth interactions with quantity. The rich information once shared through in-person interactions, or even in-depth personality questionnaires, gradually collapsed into just a few variables, mostly physical appearance. These distortions soon began shaping offline preferences as well, all within a remarkably short period compared to the usual pace of social change. Reacting to these dynamics, dating app usage has gradually declined since the pandemic, particularly among the younger generation.

A similar phenomenon can be observed in job seeking, another form of matchmaking. US Fed Chair Jerome Powell recently noted the “low firing, low hiring environment,” which he called an “interesting labor market.” Anecdotally, a similar arms race dynamic is at play, driving some applicants to burnout. Employers, for their part, are using AI (alongside older types of software) simply to cope with the deluge of applications rather than to add new value, sometimes even with the main purpose of adding friction to the application process.

Even when the goal is to increase efficiency on both sides, however, bias is inevitable as AI systems assume greater importance in employee selection. The skills needed for interviewing will never be a perfect proxy for those needed on the job, and the mismatch can affect both company culture and broader social conceptions of fairness. More data does not necessarily mean less bias; higher-bandwidth data sources, such as video, can further amplify this divergence.
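To make the proxy problem concrete, here is a minimal sketch using entirely synthetic data (the feature names, coefficients, and setup are invented for illustration, not drawn from any real system). A scoring model that never sees a protected attribute can still produce a group score gap once a high-bandwidth feature, such as a video-derived ‘delivery’ rating that happens to correlate with group membership, is added:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Synthetic applicants: `group` is a protected attribute the model never sees.
group = rng.integers(0, 2, n)
skill = rng.normal(0.0, 1.0, n)            # true job-relevant ability

# Low-bandwidth feature: a structured test score driven mostly by skill.
test_score = skill + rng.normal(0.0, 0.5, n)

# High-bandwidth feature: a video-derived "delivery" rating that carries a
# little extra skill signal but also correlates with group membership
# (think accent or camera presence).
delivery = 0.3 * skill + 0.8 * group + rng.normal(0.0, 0.5, n)

# Ground truth the model is trained to predict; identical across groups.
performance = skill + rng.normal(0.0, 0.5, n)

def mean_score_gap(features):
    """Fit OLS to performance, return the mean predicted-score gap between groups."""
    X = np.column_stack([np.ones(n)] + features)
    beta, *_ = np.linalg.lstsq(X, performance, rcond=None)
    scores = X @ beta
    return scores[group == 1].mean() - scores[group == 0].mean()

print(f"gap, structured test only: {mean_score_gap([test_score]):+.3f}")
print(f"gap, test + video feature: {mean_score_gap([test_score, delivery]):+.3f}")
```

With the structured test alone, the expected gap is near zero; adding the correlated video feature opens a gap even though true performance is identical across groups. This is one mechanism by which richer data can amplify, rather than reduce, bias.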

Recent studies indicate that AI-powered interview systems suffer from multimodal bias. Peña (2020) found that, even with equal qualifications, female and older applicants may receive scores averaging 7-12% lower across text, voice, and image analysis, particularly in the financial services industry. Booth (2021) found that the combination of modalities affects the degree of bias: text-based systems show lower bias, while adding voice and image increases biases related to gender, accent, age, and appearance. Chen (2023) demonstrated a multimodal bias aggregation mechanism, in which training data bias, annotator bias, algorithmic bias, and presentation bias can compound into complex discrimination against non-mainstream groups.

In fact, headaches over bias in hiring long predate the ongoing AI explosion. A resume screening tool that Amazon began developing in 2014 was later discovered to exhibit indirect gender bias through features such as attendance at women’s colleges. The problem proved insidious enough that engineers eventually abandoned the project in 2018.

Employment laws in different jurisdictions enumerate certain protected classes. Taiwan, for instance, has 18 of them, including ideology, appearance, facial features, and even zodiac sign and blood type. Any of these characteristics could correlate with otherwise useful judgment criteria, even if they are never directly fed into the model, which is why auditing for proxy leakage matters, as sketched below.
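One way such an audit might look in practice (the file name, column names, and thresholds here are hypothetical; the protected attributes are assumed to be collected separately, with consent, for auditing only and never used as model inputs):

```python
import pandas as pd

# Hypothetical audit file: one row per applicant, with the model's score and
# protected attributes gathered separately for auditing (never model inputs).
df = pd.read_csv("interview_scores_audit.csv")
PROTECTED = ["gender", "age_band", "blood_type", "zodiac_sign"]

# Treat the top 20% of scores as "passing" the screen.
df["passed"] = df["model_score"] >= df["model_score"].quantile(0.80)

for col in PROTECTED:
    # Spread between the best- and worst-scoring groups for this attribute.
    means = df.groupby(col)["model_score"].mean()
    # Four-fifths-rule style check: lowest group selection rate divided by
    # the highest; values well below 0.8 flag a possible proxy correlation.
    rates = df.groupby(col)["passed"].mean()
    print(f"{col:12s} score spread={means.max() - means.min():.2f} "
          f"selection-rate ratio={rates.min() / rates.max():.2f}")
```

A low ratio does not by itself prove discrimination, but it identifies which protected attributes the model may be reconstructing indirectly, and therefore where to investigate.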

While bias remains an extremely difficult problem to solve comprehensively, alignment is a parallel problem worth pursuing in its own right. The term is often used in Silicon Valley to describe properties inherent to an AI model itself: do its goals reflect those of its creators? Has reinforcement learning instilled the right lessons? That narrow perspective, however, neglects the importance of model deployers. Recruitment is considered a high-risk application of AI, with even greater social impact and scope for subjectivity than credit underwriting, so it is also worth thinking about alignment in the sense of intelligent market structure.

The video interviewing system created by TABF for bank HR departments reflects this thinking. During the interview, in addition to the standard question-and-answer format, applicants work through simulations of common banking scenarios such as handling customer complaints, credit review, cross-selling, and KYC review. After the system grades their performance across a variety of dimensions, applicants are given access to a training module that explains which parts of each answer were done well, what could be improved, and which statements or wording might be misunderstood in a financial context, with examples for reference. In this way, the system promotes not only efficient selection but also broader labor upskilling.
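As a minimal sketch of how such grade-plus-feedback output might be structured (the dimension names, fields, and example text are invented for illustration and do not represent TABF’s actual rubric):

```python
from dataclasses import dataclass, field

@dataclass
class DimensionScore:
    """One graded dimension of a scenario answer, with actionable feedback."""
    name: str                                   # e.g. "regulatory accuracy"
    score: float                                # 0-100 grade
    strengths: list[str] = field(default_factory=list)
    improvements: list[str] = field(default_factory=list)
    risky_wording: list[str] = field(default_factory=list)  # phrases easily misread in a financial context

def training_report(scenario: str, dims: list[DimensionScore]) -> str:
    """Render the feedback an applicant later reviews, weakest dimension first."""
    lines = [f"Scenario: {scenario}"]
    for d in sorted(dims, key=lambda d: d.score):
        lines.append(f"[{d.name}] {d.score:.0f}/100")
        lines += [f"  + {s}" for s in d.strengths]
        lines += [f"  ~ improve: {s}" for s in d.improvements]
        lines += [f"  ! may be misunderstood: {s}" for s in d.risky_wording]
    return "\n".join(lines)

print(training_report("customer complaint handling", [
    DimensionScore("empathy", 82,
                   strengths=["acknowledged the customer's frustration early"]),
    DimensionScore("regulatory accuracy", 61,
                   improvements=["cite the dispute-resolution timeline required by regulation"],
                   risky_wording=["'we guarantee a refund' implies a commitment the bank may not honor"]),
]))
```

The design choice worth noting is that the same per-dimension record serves both functions at once: the scores feed the selection decision, while the annotated feedback feeds the training module.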

It is difficult to regulate this aspect of AI use, although it certainly conforms to the spirit of Taiwan’s AI Basic Act. Given the inherent complexity of multimodal AI, a “best-effort” principle can sometimes be the most practical, yet still meaningful, legal basis.

Early resume keyword scanners offer a vivid demonstration of a dimensional collapse similar to the one caused by dating apps, leaving job seekers frustrated that they may have been rejected for lacking only a peripheral skill, while employers complained about a shortage of qualified candidates. Modern systems have come a long way, yet it remains an imperative of AI safety to actively guide job seekers who wish to use the interview process to improve their skills. In this way, HRTech can provide a social function beyond the interests of the two counterparties, just as the older tradition of matchmakers arguably did.