Agreement And Audit Trail
This page keeps reproducibility details out of the main findings narrative while making the public transparency path explicit. Project materials are organized through the Job Ad Research at QSB-LUC GitHub organization and Loyola Data Mining Hugging Face collections.
Public Project Links
Core public project resources:
- Job Ad Research at QSB-LUC
- JAAT software
- JAAT models
- USAJOBS raw text
- USAJOBS JAAT-coded output
- USAJOBS HTML metadata
Dashboard source, public dashboard artifacts, and working paper links should be added here when their public URLs are created.
Table 1. Public project links and repositories
Reproducibility Package
Table 2. Public release bundle components
Required Checks
Table 3. Public release agreement checks
Job-Ad-Derived Occupation-Task Pair Agreement Procedure
This section covers the LLM agreement procedure for novel occupation-task pairs derived from job ads. It is one of several LLM agreement procedures in the broader project, not the only agreement activity.
Table 4. Job-ad-derived occupation-task pair agreement outputs
LLM Judge Prompt Templates
The prompts below are the literal templates used for the agreement procedures. Runtime calls fill the fields in curly braces (for example, {task_text}) with the occupation, task, and definition text named in each template; model outputs are parsed only as YES or NO.
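The fill-and-parse step described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the function names `fill_prompt` and `parse_verdict` are hypothetical, the line breaks in the template are assumed from the layout on this page, and the tolerance for surrounding whitespace or trailing punctuation in the model reply is an assumption.

```python
from typing import Optional

# Template text copied from the occupation-task pair prompt on this page.
# The exact line breaks are an assumption.
PAIR_TEMPLATE = (
    "Occupation: {soc_title} ({occ8})\n"
    "Occupation description: {soc_description}\n"
    "Task: {task_text}\n"
    "Does this task plausibly belong to this occupation? "
    "Answer with one word: YES or NO."
)


def fill_prompt(template: str, **fields: str) -> str:
    """Substitute the curly-brace fields; raises KeyError if one is missing."""
    return template.format(**fields)


def parse_verdict(raw: str) -> Optional[bool]:
    """Map a model reply to True (YES), False (NO), or None (unparseable).

    Stripping whitespace and trailing punctuation is an assumed leniency;
    anything other than a bare YES or NO is treated as unparseable.
    """
    token = raw.strip().strip(".!\"'").upper()
    if token == "YES":
        return True
    if token == "NO":
        return False
    return None


# Illustrative placeholder values only.
prompt = fill_prompt(
    PAIR_TEMPLATE,
    soc_title="Example occupation title",
    occ8="00-0000.00",
    soc_description="Example occupation description.",
    task_text="Maintain wind turbines or solar arrays to ensure peak energy production.",
)
```

Returning `None` for anything that is not a bare YES or NO keeps unparseable replies distinguishable from a NO, so they can be reviewed rather than silently counted as disagreement.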
Occupation-task pair agreement prompt
Occupation: {soc_title} ({occ8})
Occupation description: {soc_description}
Task: {task_text}
Does this task plausibly belong to this occupation? Answer with one word: YES or NO.
Green task signal prompt
DEFINITION. The green economy is "the economic activity related to reducing
the use of fossil fuels, decreasing pollution and greenhouse gas emissions,
increasing the efficiency of energy usage, recycling materials, and
developing and adopting renewable sources of energy."
[Source: O*NET Green Occupations report, https://www.onetcenter.org/dl_files/Green.pdf]
A task is "substantively green work" if its TEXT directly advances one of
these green-economy activities. Generic operations, generic maintenance, and
administrative tasks are NOT substantively green UNLESS the task text itself
specifies an environmental, energy-efficiency, conservation, pollution-
reduction, recycling, or renewable-energy objective.
Examples:
- YES: "Develop strategies to address energy use, recycling, pollution reduction."
- YES: "Monitor emission control devices for compliance with regulations."
- YES: "Compute design specifications for implementation of conservation practices."
- YES: "Maintain wind turbines or solar arrays to ensure peak energy production."
- NO: "Assist engineering personnel to solve operating problems."
- NO: "Maintain equipment, making repairs as necessary."
- NO: "Schedule new employee orientations."
Task: {task_text}
Is this task substantively green work? Answer with one word: YES or NO.
STEM task signal prompt
DEFINITION. STEM occupations "require education in science, technology,
engineering, and mathematics (STEM) disciplines."
[Source: O*NET STEM page, https://www.onetonline.org/find/stem]
A task is "substantively STEM work" if performing it requires applying
specific scientific, technological, engineering, or mathematical knowledge.
Routine skilled trades, installation, repair, and maintenance are NOT
substantively STEM UNLESS the task text requires technical analysis,
engineering design, scientific measurement, mathematical modeling,
programming, diagnostics, instrumentation, or interpretation of technical
data.
Examples:
- YES: "Conduct statistical analyses of biological or environmental data."
- YES: "Design experiments to test theoretical hypotheses."
- YES: "Develop algorithms to solve scientific or mathematical problems."
- YES: "Calibrate avionics instrumentation using interferometry."
- YES: "Interpret CAD/CAM models to verify part tolerances against engineering specs."
- NO: "Weld metal parts together using portable gas welding equipment."
- NO: "Maintain, adjust, and clean equipment, perform minor repairs."
- NO: "Lay out pipe routes following blueprints."
Task: {task_text}
Is this task substantively STEM work? Answer with one word: YES or NO.
Downloadable Two-Model Agreement Workbooks
The workbooks below are the coauthor/public inspection copies of the Sonnet + GPT-4o two-model agreement procedures. Each workbook consolidates the reviewed rows, LLM-validated rows, disagreed/blocklisted rows, agreement statistics, and metadata for one run. Legacy llm_as_judge_* CSV filenames remain in crosswalks/ only as compatibility inputs for existing SQL and warehouse rebuilds.
| Procedure | Workbook |
|---|---|
| Occupation-task pairs, top RCA batch | XLSX |
| Occupation-task pairs, high-volume batch | XLSX |
| Occupation-task pairs, random postings-derived batch | XLSX |
| Occupation-task pairs, independent null baseline | XLSX |
| Green task bundle | XLSX |
| STEM task bundle | XLSX |
Download agreement workbook manifest CSV
Green and STEM Task Agreement Procedures
The green and STEM task reviewers are separate LLM agreement procedures used for the Finding 3 task-domain time series. They use prompt v2 and require a unanimous YES from both models for inclusion; Gemini was quota-blocked and is diagnostic only in this run.
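The inclusion rule can be sketched as follows. This is an illustration under stated assumptions: the dictionary keys `"sonnet"`, `"gpt-4o"`, and `"gemini"` and the function name `is_retained` are hypothetical, and a missing or unparseable verdict is assumed to count as not-YES.

```python
from typing import Mapping, Optional

# The two models whose verdicts decide inclusion (hypothetical key names).
INCLUSION_MODELS = ("sonnet", "gpt-4o")


def is_retained(verdicts: Mapping[str, Optional[bool]]) -> bool:
    """Retain a task only if both inclusion models answered YES.

    Verdicts from any other model (for example a diagnostic Gemini call)
    are ignored. A missing or None verdict is treated as not-YES.
    """
    return all(verdicts.get(model) is True for model in INCLUSION_MODELS)


# Both inclusion models say YES, so the diagnostic NO does not matter.
print(is_retained({"sonnet": True, "gpt-4o": True, "gemini": False}))
# One inclusion model says NO, so the task is not retained.
print(is_retained({"sonnet": True, "gpt-4o": False}))
```

Keeping the diagnostic model out of `INCLUSION_MODELS` means its verdicts can still be stored alongside the others without affecting which tasks enter the time series.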
Table 5. Green and STEM task agreement outputs
The green and STEM task workbook links above supersede the older per-CSV public downloads. Their workbook tabs are metadata, reviewed, agreement_retained, disagreed, agreement_stats, and discovery_rejected.
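A reader inspecting a downloaded workbook may want to confirm that it contains the tabs listed above. The sketch below assumes the sheet names have already been read into a list (for example with a spreadsheet library of your choice); `EXPECTED_TABS` is taken from the text above, while the helper name `missing_tabs` is hypothetical.

```python
from typing import Iterable, List

# Tab names as listed in the text above.
EXPECTED_TABS = (
    "metadata",
    "reviewed",
    "agreement_retained",
    "disagreed",
    "agreement_stats",
    "discovery_rejected",
)


def missing_tabs(sheet_names: Iterable[str]) -> List[str]:
    """Return the expected tabs that are absent, in their documented order."""
    present = set(sheet_names)
    return [tab for tab in EXPECTED_TABS if tab not in present]


# Example with a hypothetical workbook that lacks two tabs.
print(missing_tabs(["metadata", "reviewed", "disagreed", "agreement_stats"]))
```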
Public Inputs And Inspection Paths
Table 6. Public input and provenance inventory
Rebuild Path
bin/reproduce.sh is the executable path from public inputs to the dashboard. With --skip-etl, it starts from the staged DuckDB warehouse, rebuilds derived artifacts, rematerializes Evidence source parquets, and builds the static site. With --public-usajobs-dir, --nlx-dir, and --crosswalk-dir, it rebuilds the warehouse from public USAJOBS parquets, public aggregate files from the NLx corpus, and frozen crosswalk snapshots before rebuilding the dashboard outputs.
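The two invocation modes can be sketched as argument lists built in Python. Only the script path and the flag names come from the description above. The helper name `reproduce_command`, the directory paths, and the assumption that each directory flag takes its path as the following argument are all hypothetical, so check the script's own usage text before relying on this.

```python
from typing import List, Optional


def reproduce_command(
    skip_etl: bool = False,
    public_usajobs_dir: Optional[str] = None,
    nlx_dir: Optional[str] = None,
    crosswalk_dir: Optional[str] = None,
) -> List[str]:
    """Build an argv list for bin/reproduce.sh without running it.

    Assumes each directory flag is followed by its path as a separate
    argument; that calling convention is not confirmed by this page.
    """
    cmd = ["bin/reproduce.sh"]
    if skip_etl:
        cmd.append("--skip-etl")
    for flag, value in (
        ("--public-usajobs-dir", public_usajobs_dir),
        ("--nlx-dir", nlx_dir),
        ("--crosswalk-dir", crosswalk_dir),
    ):
        if value is not None:
            cmd.extend([flag, value])
    return cmd


# Start from the staged DuckDB warehouse.
print(reproduce_command(skip_etl=True))

# Full rebuild from public inputs (placeholder paths).
print(
    reproduce_command(
        public_usajobs_dir="path/to/usajobs",
        nlx_dir="path/to/nlx",
        crosswalk_dir="path/to/crosswalks",
    )
)
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` from the repository root once the flags have been verified against the script.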
Public-Safe Limits
Table 7. Public-safe limits and gated artifacts
Documentation Map
Table 8. Audit documentation index