A methodology paper for the MentionFox Physician Vetting Report. Updated 2026-05-09.
The MentionFox Physician Vetting Report is a research synthesis built from public-record sources. It is NOT a license-board verification of record, NOT a clinical-competence assessment, and NOT a substitute for a National Practitioner Data Bank (NPDB) query through your Credentialing Verification Organization. Read this page in full before relying on the report for a credentialing, hiring, or care decision.
Below: who reads these reports, the data sources we use (and the ones we deliberately don't), the source-class hierarchy, our defamation guardrails, the section-by-section synthesis logic, and the limits of what a public-records pipeline can credibly assert about a clinician.
Four use cases drive the report's framing — selectable when you order a report: credentialing, patient, expert-witness, and self-verification.
The data assembled is the same in all four cases. The framing differs because the reader's question differs.
Identifying the right person — disambiguating a name like "John Smith" or "Patricia Garcia" — is the most failure-prone step in vetting. We resolve in this priority order:
The NPI Registry is the Federal-Primary source for physician identity: the NPI is the HIPAA-mandated national provider identifier, and the registry is operated by CMS through the National Plan and Provider Enumeration System (NPPES). When the registry has no record matching the requester's input, the report does not proceed and no credit is charged.
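As a sketch, the identity-resolution lookup runs against the public NPPES v2.1 API. The endpoint and parameter names below follow the published API; the wrapper functions, the result limit, and the halt-on-empty behavior are illustrative, not our exact implementation:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

NPPES_URL = "https://npiregistry.cms.hhs.gov/api/"  # public NPPES v2.1 API

def build_nppes_query(first_name, last_name, state=None):
    """Build the NPPES v2.1 lookup URL for an individual provider (NPI-1)."""
    params = {
        "version": "2.1",
        "enumeration_type": "NPI-1",  # NPI-1 = individual clinicians, not organizations
        "first_name": first_name,
        "last_name": last_name,
        "limit": 10,
    }
    if state:
        params["state"] = state  # narrows common-name collisions
    return NPPES_URL + "?" + urlencode(params)

def resolve_identity(first_name, last_name, state=None):
    """Return candidate NPI records. An empty list halts report
    generation (no credit is charged when there is no match)."""
    with urlopen(build_nppes_query(first_name, last_name, state), timeout=10) as resp:
        data = json.load(resp)
    return data.get("results", [])
```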
Every report cites at least one Federal-Primary source per high-stakes claim. The sources, all free and publicly maintained:
| Source | What it tells us | Class |
|---|---|---|
| CMS NPI Registry | Verified identity, primary specialty, sub-specialties, primary practice address(es), enumeration date, NPI status (Active/Deactivated), credential (MD/DO). | Federal-Primary |
| OIG LEIE | Federal exclusion from Medicare/Medicaid programs. The single highest-stakes signal in any clinician verification — a positive match here is dispositive for credentialing. | Federal-Primary |
| CMS Open Payments | Industry payments — pharmaceutical, medical device, and group purchasing organization compensation across the available program years. Reported as data, not as accusation. | Federal-Primary |
| PubMed E-utilities | Total publications, journal mix, search-term audit trail. Returns a publication record we can quantify rather than estimate. | Federal-Primary |
| NIH iCite | Citation counts per PMID and Relative Citation Ratio (RCR) — the NIH-maintained field-normalized metric for research impact. We compute an h-index proxy from the citation distribution. | Federal-Primary |
| NIH RePORTER | Federal grant record — total grants, lifetime awarded dollars, active vs. ended projects. Indicates federal-peer-review recognition for research-active subjects. | Federal-Primary |
| ClinicalTrials.gov | Trial activity — total trials the subject appears in, principal-investigator roles, and current-status breakdown across recruiting / completed / terminated trials. | Federal-Primary |
| State medical board portals (top 12) | State licensure status, disciplinary actions of record. Each state has a different lookup URL; we surface the URL directly so the credentialing-of-record check is one click away. | Authoritative-Secondary |
| Hospital and academic affiliations | From NPI registry practice locations cross-referenced with PubMed author affiliations. Used to build the practice-setting picture. | Authoritative-Secondary |
| Healthgrades, Vitals, Google patient reviews | Patient-experience signals only. Surfaced as patient sentiment, never characterized as clinical-quality data. | Aggregator |
We deliberately do not consult the following sources, each for a specific reason:
Each cited source falls into one of three classes, weighted differently when the synthesis evaluates evidence strength: Federal-Primary (federally maintained registries and databases; highest weight), Authoritative-Secondary (official but non-federal sources such as state medical board portals; intermediate weight), and Aggregator (commercial review platforms; lowest weight, never sufficient on its own for a reputation-affecting claim).
The report's audit trail (Section 12 of the full Vetting Report tier) groups every cited URL by source class so the reader can verify class-by-class.
Where a section asserts a probabilistic claim — recommendation likelihood, conflict-risk verdict, malpractice probability — it uses a standard estimative-probability vocabulary: almost no chance (1–5%), very unlikely (5–20%), unlikely (20–45%), roughly even chance (45–55%), likely (55–80%), very likely (80–95%), almost certain (95–99%).
This terminology comes from the U.S. Intelligence Community's analytic-standards directive (ICD 203), which standardizes words of estimative probability; we use it because it forces the synthesis to commit to a band rather than waffle. Bands are picked based on data density, not vibes — when the underlying evidence is thin, the band defaults to "roughly even chance" with an explicit "[insufficient public evidence]" tag.
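The band assignment can be sketched as a lookup against the ICD 203 percentage ranges; the function name and the thin-evidence default below are illustrative:

```python
# (band, lower %, upper %) per the ICD 203 estimative-probability table
BANDS = [
    ("almost no chance", 1, 5),
    ("very unlikely", 5, 20),
    ("unlikely", 20, 45),
    ("roughly even chance", 45, 55),
    ("likely", 55, 80),
    ("very likely", 80, 95),
    ("almost certain", 95, 99),
]

def estimative_band(p):
    """Map a probability in [0, 1] to an ICD 203 band. None signals
    thin evidence: default to the middle band plus the explicit tag."""
    if p is None:
        return "roughly even chance [insufficient public evidence]"
    pct = p * 100
    for name, lo, hi in BANDS:
        if pct <= hi:
            return name
    return "almost certain"  # probabilities above the top band's ceiling
```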
Physician verification carries elevated legal risk: a false claim that lowers a clinician's professional standing can be defamatory per se. The synthesis follows a strict cite-don't-characterize policy:
Any claim that could lower a physician's professional standing must cite a Federal-Primary or Authoritative-Secondary source. Aggregator data alone cannot support a reputation-affecting claim.
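As a sketch, the guardrail reduces to a predicate over a claim's stakes and the classes of its citations. The function and argument names are illustrative; the class names are the three defined on this page:

```python
# Higher rank = stronger evidence class
SOURCE_CLASS_RANK = {
    "Federal-Primary": 3,
    "Authoritative-Secondary": 2,
    "Aggregator": 1,
}

def claim_is_publishable(reputation_affecting, cited_classes):
    """An uncited claim never appears (it is replaced by the
    [insufficient public evidence] tag). A reputation-affecting claim
    additionally needs at least one non-Aggregator citation."""
    if not cited_classes:
        return False
    if reputation_affecting:
        return any(SOURCE_CLASS_RANK.get(c, 0) >= 2 for c in cited_classes)
    return True
```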
Generated last, after the other 11 sections complete. Pulls the verdict-relevant facts from each: NPI status, OIG result, h-index proxy, lifetime industry payments, top specialty, and any HIGH-severity flags. Frames around the requester's use_case (credentialing, patient, expert-witness, self-verification).
The verified-identity record from the NPI Registry: legal name, NPI, credential, enumeration date, primary specialty, sub-specialties, license states. Closes with what the registry does NOT tell us — it isn't a license-of-record verification; the credentialing-of-record path runs through the relevant state board.
The primary specialty plus all sub-specialty taxonomies on the NPI record, against typical specialty norms (practice-setting distribution, sub-specialty variants).
The single highest-stakes section. Leads with the OIG LEIE check result. If a match, the section opens with "HIGH SEVERITY RED FLAG" and quotes the exclusion type and date verbatim. If clean, the section says "No federal exclusion or NPI deactivation found across automated checks as of {date}." State medical board lookup URLs are surfaced for direct verification.
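The LEIE check itself can be sketched as a scan of a local copy of the OIG's downloadable exclusion file. The column names below (NPI, EXCLTYPE, EXCLDATE) follow the LEIE download format at the time of writing; treat them, and the function itself, as an illustrative assumption:

```python
import csv

def leie_check(npi, leie_csv_path):
    """Scan a local copy of the OIG LEIE CSV for a matching NPI.
    Returns (exclusion_type, exclusion_date) on a match, else None.
    The LEIE uses an all-zero NPI for records without one, which must
    never be treated as a match."""
    with open(leie_csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row.get("NPI") == npi and npi != "0000000000":
                return row.get("EXCLTYPE"), row.get("EXCLDATE")
    return None
```

A non-None return is what triggers the "HIGH SEVERITY RED FLAG" opener, with the two returned fields quoted verbatim.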
PubMed total publications, iCite h-index proxy, median Relative Citation Ratio, NIH grants total, ClinicalTrials.gov PI roles. Closes with calibration: a research-active academic profile (high pubs / RCR / grants) vs a clinical-practice-focused profile (low pubs is normal — explicitly noted when total publications are below 20).
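The h-index proxy is the standard h-index definition computed over the per-PMID citation counts iCite returns; the field plumbing is elided and the function name is illustrative:

```python
def h_index_proxy(citation_counts):
    """Largest h such that h papers have at least h citations each,
    computed from the per-PMID citation counts in the iCite pull."""
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank  # this paper still clears the bar
        else:
            break  # counts are sorted, so no later paper can
    return h
```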
Top manufacturers, top categories (consulting / food & beverage / travel / research grants / speaking), lifetime total, most-recent program year. Reported factually. Specialty norms briefly noted (high consulting + speaking concentration in cardiology / oncology / orthopedics is common; near-zero is also common in primary care).
LOW / MODERATE / HIGH framing. HIGH requires a single-manufacturer concentration above 50% of lifetime payments AND lifetime payments above $100,000. MODERATE is top-decile relative to specialty without a research-grant explanation. LOW is everything else, including near-zero. The flag is a risk-review framing, never an indictment.
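The severity rule is mechanical enough to sketch directly. Argument names are illustrative; the thresholds are the ones stated above:

```python
def conflict_flag(lifetime_total, top_manufacturer_total,
                  top_decile_for_specialty=False, has_research_grants=False):
    """HIGH: one manufacturer exceeds 50% of lifetime payments AND
    lifetime payments exceed $100,000.
    MODERATE: top decile for the specialty with no research-grant explanation.
    LOW: everything else, including near-zero."""
    if lifetime_total > 100_000 and top_manufacturer_total > 0.5 * lifetime_total:
        return "HIGH"
    if top_decile_for_specialty and not has_research_grants:
        return "MODERATE"
    return "LOW"
```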
Practice locations from the NPI record cross-referenced with academic affiliations on PubMed papers and clinical-trial team-member rolls.
Five archetype-matched peers from the comparable_physicians_reference table (60 seeded entries). Composite scoring on specialty (×6), academic vs clinical archetype (×3), publication record overlap (×2), career stage (×2). Each comparison cites the peer's source URL.
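A sketch of the composite scoring, assuming each dimension's similarity has been normalized to [0, 1] upstream; the function names and that normalization assumption are illustrative:

```python
# Weights from the peer-matching rubric: specialty x6, archetype x3,
# publication overlap x2, career stage x2
WEIGHTS = {"specialty": 6, "archetype": 3,
           "publication_overlap": 2, "career_stage": 2}

def peer_score(components):
    """components: per-dimension similarity scores in [0, 1].
    Missing dimensions contribute zero."""
    return sum(WEIGHTS[dim] * components.get(dim, 0.0) for dim in WEIGHTS)

def top_peers(components_by_peer, k=5):
    """Rank the seeded candidates by composite score; keep the top five."""
    ranked = sorted(components_by_peer.items(),
                    key=lambda kv: peer_score(kv[1]), reverse=True)
    return [peer_id for peer_id, _ in ranked[:k]]
```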
Currently a thin section in v0.1 — pending Healthgrades / Vitals / Google reviews scrape (v0.2 work). The honest framing is surfaced today: PubMed authorship, NIH funding, ClinicalTrials.gov roles are professional-standing signals, not patient-experience signals. Patient experience needs the deferred aggregator pull.
Currently a v0.2 deferral with explicit honest framing. The OIG LEIE check in Section 4 is the strongest automated malpractice-related signal in v0.1. Civil malpractice case surfacing via CourtListener is the v0.2 build.
PubMed co-authorship graph (top 5 most-frequent collaborators inferred from top-cited works) plus clinical-trial co-officials. v0.1 caveat: direct referral patterns from claims data are not in the public record.
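The collaborator count is a straightforward tally over the author lists of the subject's top-cited works; the function name and input shape are illustrative:

```python
from collections import Counter

def top_collaborators(papers, subject_name, k=5):
    """papers: list of author-name lists from the subject's top-cited
    works (plus clinical-trial co-officials, merged upstream).
    Returns the k most frequent co-authors, excluding the subject."""
    counts = Counter()
    for authors in papers:
        counts.update(a for a in authors if a != subject_name)
    return [name for name, _ in counts.most_common(k)]
```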
An aggregated audit trail of every URL cited across the prior 11 sections, deduplicated and grouped by source class (Federal-Primary, Authoritative-Secondary, Aggregator). The reader can verify each claim by opening its citation.
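The aggregation can be sketched as a dedupe-and-group pass over every (URL, class) citation pair collected while the earlier sections render; names are illustrative:

```python
def audit_trail(citations):
    """citations: iterable of (url, source_class) pairs collected across
    the prior 11 sections. Deduplicates by URL (first occurrence wins)
    and groups by source class, preserving citation order."""
    grouped = {"Federal-Primary": [], "Authoritative-Secondary": [], "Aggregator": []}
    seen = set()
    for url, cls in citations:
        if url not in seen:
            seen.add(url)
            grouped[cls].append(url)
    return grouped
```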
Every claim in a Physician Vetting Report cites a public URL the reader can open and verify. This is not a feature; it is the contract. If a claim cannot be cited, it does not appear in the report — replaced instead with the [insufficient public evidence as of {date}] tag.
This makes the reports auditable in the strict sense: a credentialing committee or compliance reviewer can re-run the verification chain by hand from the citations alone. The report is a research synthesis they can check, not an opaque score.