What Postings And Taxonomies Can And Cannot Say
Taxonomies Shape What Posting Measures Can See
Posting-derived task, skill, and concept signals are useful, but they are bounded by the taxonomy used to search for them. This page is a methodology caveat rather than a top-level finding. It uses AI-related evidence as a stress test: sparse O*NET task statements, ESCO skill rollups, STEM task agreement outputs, AIMatch scores, and posting keyword concepts do not collapse into the same signal. The same NLx corpus can imply decline, stability, modest growth, or near-tripling depending on what the taxonomy is capable of seeing.
Taxonomy coverage and movement in the indexed AI-related signals
AI-Signal Comparison
Figure B.1. Skill, O*NET ML / AI Task, STEM Task, and AIMatch Signals Indexed to ChatGPT Launch (NLx corpus)
Note: Each series is first expressed per 1,000 active jobs in the NLx corpus, then divided by that series' November 2022 value; index = 1.0 in November 2022 for every series. The AI keyword concept line is reserved for a deduplicated any-keyword aggregate from the updated NLx job_concepts.tsv pipeline; component dictionary-family series are shown separately in Figure B.5. O*NET ML / AI tasks are O*NET task statements mentioning AI, artificial intelligence, or machine learning. STEM tasks are O*NET task IDs retained by the prompt-v2 STEM task LLM agreement procedure.
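The indexing described in the note can be sketched as follows. This is a minimal illustration, not the dashboard's pipeline: the function name `index_to_base` and the input numbers are hypothetical, and months are assumed to be keyed as `YYYY-MM` strings.

```python
BASE_MONTH = "2022-11"  # ChatGPT launch month, used as the index base


def index_to_base(signal_counts, active_jobs, base_month=BASE_MONTH):
    """Express a series per 1,000 active jobs, then divide by its base-month value."""
    rate = {
        month: 1000.0 * signal_counts[month] / active_jobs[month]
        for month in signal_counts
    }
    base = rate[base_month]
    if base == 0:
        raise ValueError("base-month rate is zero; the series cannot be indexed")
    return {month: value / base for month, value in rate.items()}


# Illustrative numbers only; these are NOT values from the NLx corpus.
counts = {"2022-11": 200, "2023-11": 450}
jobs = {"2022-11": 1_000_000, "2023-11": 1_500_000}
indexed = index_to_base(counts, jobs)
```

In this toy input the raw count more than doubles, but the indexed series rises only to about 1.5 because the active-job denominator also grew. The per-1,000 scaling cancels in the index; it matters only for the unindexed rate.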
Figure B.1 is the core Finding 2 comparison. Since November 2022, strict AIMatch rises sharply, while ESCO digital skills and STEM-task signals decline and broader information-skill signals remain roughly flat. The divergence is the point: conclusions about technology-related demand depend on what the taxonomy is capable of seeing.
Table B.1. AI-adjacent signal movement since ChatGPT launch (NLx corpus)
STEM Task Agreement Procedure
The STEM-task line in Figure B.1 uses a task-level two-model agreement procedure, not the occupation-level O*NET STEM list alone. Claude Sonnet 4.6 and GPT-4o reviewed candidate O*NET task IDs and retained task IDs only when both models returned YES under the prompt definition of substantively STEM work.
STEM task agreement summary (reported metrics: distinct task IDs reviewed, LLM-validated STEM task IDs, two-model same-verdict rate, Cohen's kappa)
The procedure reviewed 7,216 complete rows representing 6,848 distinct task IDs. It retained 1,440 rows, corresponding to 1,375 distinct STEM task IDs; the row count exceeds the task-ID count because some task IDs appear through more than one review pathway. The underlying workbook is available for download as STEM task agreement data. The prompt and additional agreement-workbook links are in the Agreement and Audit Trail.
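The both-models-YES retention rule and the two agreement statistics can be sketched as below. This is a hedged illustration: the row layout `(task_id, model_a_verdict, model_b_verdict)`, the function name `agreement_summary`, and the sample rows are assumptions, and whether the page computes kappa at the row level or the task-ID level is not stated, so the sketch works at the row level.

```python
from collections import Counter


def agreement_summary(rows):
    """rows: list of (task_id, verdict_a, verdict_b), verdicts 'YES' or 'NO'."""
    n = len(rows)
    # Observed agreement: share of rows where both models gave the same verdict.
    p_observed = sum(1 for _, a, b in rows if a == b) / n
    # Chance agreement from each model's marginal YES/NO shares.
    a_counts = Counter(a for _, a, _ in rows)
    b_counts = Counter(b for _, _, b in rows)
    p_chance = sum((a_counts[k] / n) * (b_counts[k] / n) for k in ("YES", "NO"))
    kappa = (p_observed - p_chance) / (1 - p_chance) if p_chance < 1 else float("nan")
    # Retain a row only when both models returned YES.
    retained_rows = [r for r in rows if r[1] == "YES" and r[2] == "YES"]
    return {
        "same_verdict_rate": p_observed,
        "kappa": kappa,
        "retained_rows": len(retained_rows),
        "retained_task_ids": {task_id for task_id, _, _ in retained_rows},
    }


# Made-up rows; T1 appears twice to mimic a task ID reached via two review pathways.
sample = [
    ("T1", "YES", "YES"),
    ("T1", "YES", "YES"),
    ("T2", "YES", "NO"),
    ("T3", "NO", "NO"),
    ("T4", "NO", "NO"),
    ("T5", "NO", "YES"),
]
summary = agreement_summary(sample)
```

In the sample, two rows are retained but only one distinct task ID, which mirrors the row/task-ID gap described above.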
O*NET 27.2 task statements are searched for three terms: "artificial intelligence," "machine learning," and the standalone acronym "AI."
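A standalone-acronym search of this kind might look like the sketch below. The pattern is an assumption rather than the page's actual matcher: it treats the two phrases as case-insensitive, keeps "AI" case-sensitive, and uses word boundaries so that strings such as "Maintain" or "AIR" do not match. The sample statements are invented.

```python
import re

# Phrases match case-insensitively; the acronym must be uppercase and standalone.
AI_TERM_PATTERN = re.compile(
    r"\b(?:(?i:artificial\s+intelligence|machine\s+learning)|AI)\b"
)


def mentions_ai(task_statement):
    """Return True if the statement contains any of the three search terms."""
    return AI_TERM_PATTERN.search(task_statement) is not None


# Invented statements, not O*NET text.
examples = {
    "Develop AI-driven forecasting tools.": mentions_ai("Develop AI-driven forecasting tools."),
    "Apply Machine Learning methods to sensor data.": mentions_ai("Apply Machine Learning methods to sensor data."),
    "Maintain and repair AIR compressors.": mentions_ai("Maintain and repair AIR compressors."),
}
```

Without the word boundaries, a plain substring search for "ai" would flag a large share of ordinary task statements, which is why the acronym is handled separately from the phrases.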
Sparse O*NET and ESCO Coverage
Table B.2. O*NET task statements containing AI-related terms
Figure B.2. Sparse O*NET AI Task Intensity
Figure B.3. Narrow ESCO Data Skill Evidence
ESCO terms used here: narrow data-skill groups are S2.7, "analysing and evaluating information and data," and S5.5, "accessing and analysing digital data." Broad rollups are all S2.* information skills and all S5.* digital/computer skills.
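The narrow-versus-broad distinction can be expressed as a prefix rule on ESCO skill-group codes. The sketch below is illustrative: the helper names are hypothetical, the sub-codes in the usage lines are examples rather than a statement of which codes appear in the data, and treating a bare "S2" or "S5" code as part of its own rollup is an assumption.

```python
NARROW_GROUPS = ("S2.7", "S5.5")   # narrow data-skill groups named above
BROAD_ROLLUPS = ("S2", "S5")       # all S2.* information and all S5.* digital skills


def in_group(code, group):
    """True if code is the group itself or a descendant such as group + '.1'."""
    return code == group or code.startswith(group + ".")


def classify(code):
    """Return (broad_rollup, narrow_group); either element may be None."""
    broad = next((g for g in BROAD_ROLLUPS if in_group(code, g)), None)
    narrow = next((g for g in NARROW_GROUPS if in_group(code, g)), None)
    return broad, narrow


classified = {code: classify(code) for code in ("S2.7.1", "S5.1", "S1.3")}
```

Matching on `group + "."` rather than on the bare prefix keeps a code such as "S20.1" out of the S2 rollup. Every narrow code also falls inside a broad rollup, so the broad series always contain the narrow ones.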
Table B.3. Narrow ESCO data-skill evidence by month and corpus
Figure B.4. Broad ESCO Information (S2) and Digital Skill (S5) Rollups
The broad rollups make the contrast visible: status-quo skill taxonomies can show substantial information-skill or digital-skill coverage without indicating whether postings ask for the narrower data-science-adjacent work captured by S2.7 and S5.5. Broad rollups from the NLx corpus are assembled from aggregate skill-job counts, so a job that matches several skills in a rollup is counted once per skill; read them as rollup coverage rather than deduplicated job-level incidence.
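A small worked example shows why summed skill-job counts are not job-level incidence. The job IDs and skill assignments below are invented for illustration, and the variable names are hypothetical.

```python
from collections import Counter

# Hypothetical postings and the S5.* skill groups each one matched.
job_skills = {
    "job-1": {"S5.5", "S5.6"},
    "job-2": {"S5.5"},
    "job-3": {"S5.1", "S5.5", "S5.6"},
    "job-4": set(),
}

# Aggregate skill-job counts: one count per (skill group, job) pair.
skill_job_counts = Counter(
    skill for skills in job_skills.values() for skill in skills
)

# Rollup coverage sums the pairs, so multi-skill jobs are counted repeatedly.
rollup_coverage = sum(skill_job_counts.values())

# Deduplicated incidence counts each job once if it matched any S5.* skill.
deduplicated_jobs = sum(1 for skills in job_skills.values() if skills)
```

Here the rollup sums to six skill-job pairs across only three distinct jobs. Recovering the deduplicated figure requires job-level rows, which aggregate counts alone do not provide.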
Posting-Derived AI Keyword Evidence
Figure B.5. AI Job Postings by AI Concept Family (NLx corpus)
The keyword comparison uses AI concept-family aggregates from the NLx corpus rather than USAJOBS ConceptSearch rows, keeping Finding 2's keyword evidence in the same national NLx corpus as Figure B.1. Each line is a pre-aggregated concept-family incidence series: each monthly value is total_jobs, the count of active jobs with at least one match to that concept family in the month, divided by the monthly active-job denominator. The chart does not add component keywords or concept families together, so it avoids double-counting postings that match multiple terms or multiple AI dictionaries. Exact component keyword lists are not redistributed here.
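A concept-family incidence series of this shape could be computed as in the sketch below. It assumes posting-level match rows of the form `(month, job_id, family, keyword)`; the family names, keywords, and counts are invented, and the actual job_concepts.tsv layout is not described on this page.

```python
from collections import defaultdict


def family_incidence(matches, active_jobs):
    """Share of active jobs with at least one match, per (month, family)."""
    jobs_by_key = defaultdict(set)
    for month, job_id, family, _keyword in matches:
        jobs_by_key[(month, family)].add(job_id)  # a set counts each job once
    return {
        key: len(job_ids) / active_jobs[key[0]]
        for key, job_ids in jobs_by_key.items()
    }


def any_family_incidence(matches, active_jobs):
    """Deduplicated share of active jobs matching any family, per month."""
    jobs_by_month = defaultdict(set)
    for month, job_id, _family, _keyword in matches:
        jobs_by_month[month].add(job_id)
    return {m: len(ids) / active_jobs[m] for m, ids in jobs_by_month.items()}


# Invented rows: j1 matches two keywords in one family and also a second family.
matches = [
    ("2023-01", "j1", "family_a", "keyword_1"),
    ("2023-01", "j1", "family_a", "keyword_2"),
    ("2023-01", "j1", "family_b", "keyword_3"),
    ("2023-01", "j2", "family_b", "keyword_4"),
]
active_jobs = {"2023-01": 100}

by_family = family_incidence(matches, active_jobs)
any_family = any_family_incidence(matches, active_jobs)
```

Summing the two family values would give 0.03, while only 2 of 100 jobs matched anything. That gap is the reason the family lines are plotted separately rather than added.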
Table B.4. Broad ESCO information and digital skill evidence by month and corpus
AIMatch Code Intensity
Figure B.6. Strict AI Code Intensity
Figure B.7. Lenient AI Code Intensity
Signal Magnitudes (NLx corpus)
Table B.5. AI-adjacent signal magnitudes relative to strict AIMatch (NLx corpus)
AI code profile lookup tables have moved from this caveat page to AI Code Profiles so that this page stays focused on taxonomy coverage and denominator sensitivity.
Methodological Demonstration: Denominator Sensitivity
Figure B.8. Methodological Demonstration: Strict AI Signals Indexed to November 2022 (NLx corpus)
Figure B.9. Methodological Demonstration: USAJOBS Strict AI Signals, Indexed to Biden-Administration Mean
Note: These methodological demonstrations hold the numerator definition fixed to strict AIMatch AI codes and vary the denominator or month assignment. The NLx corpus panel compares raw strict-code volume with strict codes per monthly active job, indexed to November 2022. The USAJOBS panel compares raw active-month code volume, strict codes per monthly active posting, strict codes per posting started, and strict codes per posting ended, indexed to the February 2021-January 2025 Biden-administration monthly mean. Ordinary dashboard trend figures use monthly active postings/jobs; these figures show how denominator and timing choices can alter the apparent trend.
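The effect of holding the numerator fixed while changing the denominator can be sketched as follows. The numbers are invented to make the contrast obvious and are not USAJOBS or NLx values; the function names and the two-month baseline window are assumptions for illustration.

```python
def per_denominator(numerator, denominator):
    """Divide a monthly numerator by a monthly denominator."""
    return {m: numerator[m] / denominator[m] for m in numerator}


def index_to_mean(series, base_months):
    """Index a series to its mean over a baseline window."""
    base = sum(series[m] for m in base_months) / len(base_months)
    return {m: value / base for m, value in series.items()}


# Illustrative only: strict AI codes and active postings for three months.
strict_codes = {"2024-12": 100, "2025-01": 100, "2025-02": 150}
active_postings = {"2024-12": 1000, "2025-01": 1000, "2025-02": 3000}
baseline = ["2024-12", "2025-01"]  # stand-in for the longer baseline window

raw_index = index_to_mean(strict_codes, baseline)
rate_index = index_to_mean(per_denominator(strict_codes, active_postings), baseline)
```

With these inputs the raw-volume index for the final month is 1.5 while the per-active-posting index is 0.5, so the same numerator reads as growth or decline depending on the denominator. Swapping in postings started or postings ended as the denominator follows the same pattern.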