Evaluation info

The evaluation for TalentCLEF-2026 will be conducted on Codabench. Submissions will be ranked using Mean Average Precision (MAP) and Normalized Discounted Cumulative Gain (NDCG).

For local testing, an evaluation script is available in our GitHub repository: Evaluation Script. A tutorial on generating submission files and evaluating them is available in the Tutorials section of the Additional resources page.

TalentCLEF Task A - Codabench

TalentCLEF Task B - Codabench

Evaluation dates

Task A:

  • Codabench: Link
  • Start Date: 13th April 2026
  • End Date: 3rd May 2026

Task B:

  • Codabench: Link
  • Start Date: 13th April 2026
  • End Date: 3rd May 2026

Evaluation Criteria

The top-performing teams will be determined based on the following evaluation criteria:

  1. Task A:

    • Best Overall Multilingual Performance – The highest-performing system across English and Spanish, measured as the average Mean Average Precision (MAP) in en-en and es-es.
    • Best Cross-Lingual Performance – The best-performing system in cross-lingual scenarios, calculated as the MAP in en-es.
    • Best Bias-Controlled Model – The system that minimizes performance differences across gender groups.

    During the CLEF workshop, certificates will be awarded to the first and second-best systems in each category.

  2. Task B: The highest-performing system based on Normalized Discounted Cumulative Gain (NDCG).

    • Graded Relevance – Skills are weighted differently: core skills (weight = 2) and contextual skills (weight = 1). This approach distinguishes the importance of different skill types during the evaluation.
    • Binary Relevance – All relevant skills (both core and contextual) are treated equally with the same weight, without distinction.
    • The graded relevance NDCG will be used to determine the top-performing team.
    • Additional metrics may be reported in the overview paper.

    During the CLEF workshop, certificates will be awarded to the best-performing systems.
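
As a rough illustration of the two ranking metrics named in the criteria above, the following Python sketch computes average precision (the per-query building block of MAP) and NDCG under both graded and binary relevance. It is not the official evaluation script: the function names, the toy identifiers (`j1`, `s1`, ...), the linear gain, the log2 discount, and the absence of a rank cutoff are all assumptions made for illustration, so the official script may produce different numbers.

```python
import math


def average_precision(ranked_ids, relevant_ids):
    """AP for one query: mean of precision@k over the ranks k holding a relevant item.

    Assumes ranked_ids contains no duplicates.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / k
    return precision_sum / len(relevant)


def mean_average_precision(runs, qrels):
    """MAP: mean AP over every query in qrels (a missing run counts as AP = 0)."""
    aps = [average_precision(runs.get(q, []), rel) for q, rel in qrels.items()]
    return sum(aps) / len(aps)


def ndcg(ranked_ids, gains):
    """NDCG with linear gain and log2 discount (both are assumptions).

    gains maps each relevant item to its relevance weight; other items score 0.
    """
    dcg = sum(gains.get(item, 0) / math.log2(k + 1)
              for k, item in enumerate(ranked_ids, start=1))
    ideal = sorted(gains.values(), reverse=True)
    idcg = sum(g / math.log2(k + 1) for k, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


# Task A style: one query, relevant items at ranks 1 and 3.
ap = average_precision(["j1", "j2", "j3", "j4"], {"j1", "j3"})
print(round(ap, 3))  # 0.833, i.e. (1/1 + 2/3) / 2

# Task B style: a contextual skill (s1) is ranked above a core skill (s2).
ranking = ["s1", "s2", "s3"]
graded = ndcg(ranking, {"s2": 2, "s1": 1})  # core = 2, contextual = 1
binary = ndcg(ranking, {"s2": 1, "s1": 1})  # all relevant skills weigh the same
print(round(graded, 3), round(binary, 3))  # 0.86 1.0
```

In this toy case the binary variant scores the ranking as perfect, while the graded variant penalizes it for placing a contextual skill above a core one, which is the distinction the graded weights are meant to capture. Under the same assumptions, the multilingual Task A score would be the plain average of `mean_average_precision` computed separately on en-en and es-es.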