Almost every job posting you've ever looked at had a hidden version of itself sitting alongside the visible HTML - a clean, machine-readable description of the role with structured fields for title, location, date posted, salary range, and employer. It's called JSON-LD JobPosting, it's embedded as a script tag on the page, and it's the reason Google Jobs shows you a tile with a salary band and a "5 hours ago" timestamp while LinkedIn is still telling you the role was posted "2 days ago".
Understanding this format does two useful things. It tells you why some channels are structurally faster than others (Google Jobs uses JSON-LD directly; LinkedIn doesn't). And it gives you a clean route to monitor company careers pages without trying to parse HTML that wasn't built to be parsed.
JSON-LD ("JSON for Linked Data") is a way of embedding machine-readable data inside a web page. The JobPosting schema, defined at schema.org/JobPosting, is one of dozens of types Google uses to ingest structured information from the web.
Why every company emits it: Google requires it. If a role doesn't have a valid JobPosting JSON-LD block, it doesn't appear in Google Jobs - and Google Jobs is increasingly where candidates start their search. So every modern ATS (Workday, Greenhouse, Lever, Ashby, Phenom, iCIMS) emits this structured data on its public posting pages, whether the employer asks for it or not.
If you view the source of a typical Greenhouse-hosted job posting and search for application/ld+json, you'll see something close to this:
{
  "@context": "https://schema.org",
  "@type": "JobPosting",
  "title": "Senior Backend Engineer",
  "description": "We're looking for...",
  "datePosted": "2026-05-12",
  "validThrough": "2026-08-12",
  "employmentType": "FULL_TIME",
  "hiringOrganization": {
    "@type": "Organization",
    "name": "Acme",
    "sameAs": "https://acme.example"
  },
  "jobLocation": {
    "@type": "Place",
    "address": {
      "@type": "PostalAddress",
      "addressLocality": "London",
      "addressCountry": "GB"
    }
  },
  "baseSalary": {
    "@type": "MonetaryAmount",
    "currency": "GBP",
    "value": {
      "@type": "QuantitativeValue",
      "minValue": 90000,
      "maxValue": 130000,
      "unitText": "YEAR"
    }
  }
}
The fields are largely self-explanatory. The two you'll care about most as a job seeker are datePosted (the canonical "when was this role published" timestamp) and baseSalary (when present; pay-transparency moves in the UK and laws such as California's mean a growing share of postings are required to include it).
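Because the block is plain JSON, pulling those two fields out takes nothing beyond the standard library. A minimal sketch, using a trimmed copy of the example block above:

```python
import json
from datetime import date

# A trimmed copy of the example JobPosting block above.
posting = json.loads("""
{
  "@type": "JobPosting",
  "title": "Senior Backend Engineer",
  "datePosted": "2026-05-12",
  "baseSalary": {
    "@type": "MonetaryAmount",
    "currency": "GBP",
    "value": {"minValue": 90000, "maxValue": 130000, "unitText": "YEAR"}
  }
}
""")

# datePosted is an ISO 8601 date, so it parses directly.
posted = date.fromisoformat(posting["datePosted"])

# baseSalary is optional, so fall back to empty dicts when it's absent.
value = posting.get("baseSalary", {}).get("value", {})

print(posted)                                        # 2026-05-12
print(value.get("minValue"), value.get("maxValue"))  # 90000 130000
```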
Why does this matter to you? Three practical reasons.
First: the datePosted field is the source-of-truth timestamp. When an aggregator says a role was "posted 3 days ago", they're showing the date they first ingested it, not the date the company actually published it. The JSON-LD on the company's own page gives you the real number. Our measurement of the ATS-to-LinkedIn delay is built on this comparison.
Second: Google Jobs ingests directly via JSON-LD. As soon as Googlebot crawls a careers page (which happens within hours for most company sites), the role appears in Google Jobs search results. LinkedIn and Indeed have to do their own crawling, parsing, deduplication and classification, which is where their 1 to 5 day delay comes from. The cost analysis of aggregator delay walks through this.
Third: if you're building any kind of monitoring, JSON-LD is dramatically better than parsing raw HTML. The data is already structured. You don't need CSS selectors that break when the company redesigns their careers page. You just look for <script type="application/ld+json"> blocks and parse them as JSON.
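That re-dating is easy to quantify once you have both dates: the canonical datePosted from the page, and the publish date an aggregator's "posted N days ago" label implies. A small sketch with made-up dates:

```python
from datetime import date, timedelta

def redating_gap(date_posted: str, days_ago_claim: int, today: date) -> int:
    """How many days older the role really is than the aggregator claims.

    date_posted: the canonical datePosted from the page's JSON-LD.
    days_ago_claim: the "posted N days ago" figure the aggregator shows.
    """
    implied = today - timedelta(days=days_ago_claim)  # date the label implies
    actual = date.fromisoformat(date_posted)          # date the page asserts
    return (implied - actual).days

# Hypothetical numbers: the page says 2026-05-12, but on 2026-05-20 an
# aggregator still shows "posted 3 days ago" (implying 2026-05-17).
print(redating_gap("2026-05-12", 3, date(2026, 5, 20)))  # 5
```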
In any browser: view the page source and search for application/ld+json. The script block whose "@type" is "JobPosting" is the role data. You can validate it with Google's Rich Results Test tool, which both confirms the JSON-LD is well-formed and tells you whether Google Jobs would index it.
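If you'd rather script that check than eyeball the source, it's a few lines of Python. Google's JobPosting documentation lists title, description, datePosted, hiringOrganization and a job location among the required properties; the regex here is a quick-check shortcut for a sketch, not a substitute for a real HTML parser:

```python
import json
import re

# Required-for-indexing properties per Google's JobPosting documentation
# (remote-only roles use applicantLocationRequirements instead of jobLocation).
REQUIRED = ("title", "description", "datePosted", "hiringOrganization", "jobLocation")

SCRIPT_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
    re.DOTALL,
)

def missing_fields(html: str) -> list[str]:
    """Report which required properties the page's JobPosting block lacks."""
    for block in SCRIPT_RE.findall(html):
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue
        for item in (data if isinstance(data, list) else [data]):
            if isinstance(item, dict) and item.get("@type") == "JobPosting":
                return [f for f in REQUIRED if f not in item]
    raise ValueError("no JobPosting JSON-LD block found")

# A deliberately incomplete posting for illustration.
page = ('<script type="application/ld+json">'
        '{"@type": "JobPosting", "title": "Engineer", "datePosted": "2026-05-12"}'
        '</script>')
print(missing_fields(page))  # ['description', 'hiringOrganization', 'jobLocation']
```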
Our complete ATS reference covers how to identify which system a company uses.
For technical readers, here's the rough shape of a Python script that monitors a single careers page for new JSON-LD JobPosting entries:
import json
import requests
from bs4 import BeautifulSoup

def fetch_postings(url):
    """Return every JSON-LD JobPosting object found on the page."""
    html = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    out = []
    for tag in soup.find_all("script", type="application/ld+json"):
        try:
            data = json.loads(tag.string)
        except (json.JSONDecodeError, TypeError):
            # Empty tag, or a script block that isn't valid JSON.
            continue
        # A single tag can hold one object or a list of them.
        items = data if isinstance(data, list) else [data]
        for item in items:
            if item.get("@type") == "JobPosting":
                out.append(item)
    return out

# Run on a schedule; diff against previous run; email new entries.
The full machinery (handling JavaScript-rendered pages, dealing with rate limits, deduplicating across runs, parsing salary into a queryable form) is straightforward but more involved. Our full guide to monitoring careers pages compares this approach with the alternatives.
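Of those, the diff step is the smallest. One sketch, assuming the fetch_postings function above and a hypothetical JSON state file; hashing title plus location is a stand-in for a proper key such as the ATS job ID from the posting URL:

```python
import hashlib
import json
from pathlib import Path

STATE = Path("seen_postings.json")  # hypothetical state file between runs

def posting_key(item: dict) -> str:
    """A stable-ish identifier: hash of title + location. A real monitor
    should prefer the ATS job ID from the posting URL when available."""
    raw = json.dumps([item.get("title"), item.get("jobLocation")], sort_keys=True)
    return hashlib.sha256(raw.encode()).hexdigest()

def new_postings(postings: list[dict]) -> list[dict]:
    """Return postings not seen on any previous run; remember all of them."""
    seen = set(json.loads(STATE.read_text())) if STATE.exists() else set()
    fresh = [p for p in postings if posting_key(p) not in seen]
    STATE.write_text(json.dumps(sorted(seen | {posting_key(p) for p in postings})))
    return fresh
```

Each run, pass the output of fetch_postings through new_postings and alert on whatever comes back.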
If you're searching primarily on aggregators, you are inherently downstream of the structured data the company already published. The same JSON-LD that fed Google Jobs the day the role went live is sitting on the company's page waiting to be read, and the aggregator's redacted, classified, day-late copy is what you're looking at instead.
The practical implication: Google Jobs is structurally faster than LinkedIn or Indeed because of how the pipeline works. For roles where same-day application matters, prefer Google Jobs over LinkedIn search if you're not using direct ATS monitoring. Our comparison of the three approaches walks through the practical tradeoff.
JSON-LD JobPosting is the quiet primitive sitting underneath most of the modern job-search infrastructure. Aggregators add their own ingestion, parsing, deduplication and classification layers on top - which is why they're a day or two behind the canonical source.
For most candidates, the practical implication is simpler than the technical story: if you're choosing between job-search channels, Google Jobs is structurally fresher than LinkedIn because it reads JSON-LD directly. If you're building any kind of monitoring yourself, JSON-LD is the thing to read - not the rendered HTML, and definitely not the aggregators' indexed copy.