A methodology paper for the MentionFox Physician Vetting Report. Updated 2026-05-09.
The MentionFox Physician Vetting Report is a research synthesis built from public-record sources. It is NOT a license-board verification of record, NOT a clinical-competence assessment, and NOT a substitute for a National Practitioner Data Bank (NPDB) query through your Credentialing Verification Organization. Read this page in full before relying on the report for a credentialing, hiring, or care decision.
Below: who reads these reports, the data sources we use (and the ones we deliberately don't), the source-class hierarchy, our defamation guardrails, the section-by-section synthesis logic, and the limits of what a public-records pipeline can credibly assert about a clinician.
Four use cases drive the report's framing — selectable when you order a report: credentialing, patient, expert-witness, and self-verification.
The data assembled is the same in all four cases. The framing differs because the reader's question differs.
Identifying the right person — disambiguating a name like "John Smith" or "Patricia Garcia" — is the most failure-prone step in vetting. We resolve in this priority order:
The NPI Registry is the Federal-Primary source for physician identity: the NPI is the HIPAA-mandated national provider identifier, and the registry is operated by CMS through the National Plan and Provider Enumeration System (NPPES). When the registry has no record matching the requester's input, the report does not proceed and no credit is charged.
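As a sketch, the identity-resolution lookup runs against the public NPPES v2.1 API. The endpoint and parameter names below follow the published API; the wrapper functions, the result limit, and the halt-on-empty behavior are illustrative, not our exact implementation:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

NPPES_URL = "https://npiregistry.cms.hhs.gov/api/"  # public NPPES v2.1 API

def build_nppes_query(first_name, last_name, state=None):
    """Build the NPPES v2.1 lookup URL for an individual provider (NPI-1)."""
    params = {
        "version": "2.1",
        "enumeration_type": "NPI-1",  # NPI-1 = individual clinicians, not organizations
        "first_name": first_name,
        "last_name": last_name,
        "limit": 10,
    }
    if state:
        params["state"] = state  # narrows common-name collisions
    return NPPES_URL + "?" + urlencode(params)

def resolve_identity(first_name, last_name, state=None):
    """Return candidate NPI records. An empty list halts report
    generation (no credit is charged when there is no match)."""
    with urlopen(build_nppes_query(first_name, last_name, state), timeout=10) as resp:
        data = json.load(resp)
    return data.get("results", [])
```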
Every report cites at least one Federal-Primary source per high-stakes claim. The sources, all free and publicly maintained:
| Source | What it tells us | Class |
|---|---|---|
| CMS NPI Registry | Verified identity, primary specialty, sub-specialties, primary practice address(es), enumeration date, NPI status (Active/Deactivated), credential (MD/DO). | Federal-Primary |
| OIG LEIE | Federal exclusion from Medicare/Medicaid programs. The single highest-stakes signal in any clinician verification — a positive match here is dispositive for credentialing. | Federal-Primary |
| CMS Open Payments | Industry payments — pharmaceutical, medical device, and group purchasing organization compensation across the available program years. Reported as data, not as accusation. | Federal-Primary |
| PubMed E-utilities | Total publications, journal mix, search-term audit trail. Returns a publication record we can quantify rather than estimate. | Federal-Primary |
| NIH iCite | Citation counts per PMID and Relative Citation Ratio (RCR) — the NIH-maintained field-normalized metric for research impact. We compute an h-index proxy from the citation distribution. | Federal-Primary |
| NIH RePORTER | Federal grant record — total grants, lifetime awarded dollars, active vs. ended projects. Indicates federal-peer-review recognition for research-active subjects. | Federal-Primary |
| ClinicalTrials.gov | Trial activity — total trials the subject appears in, principal-investigator roles, and current-status breakdown across recruiting / completed / terminated trials. | Federal-Primary |
| State medical board portals (top 12) | State licensure status, disciplinary actions of record. Each state has a different lookup URL; we surface the URL directly so the credentialing-of-record check is one click away. | Authoritative-Secondary |
| Hospital and academic affiliations | From NPI registry practice locations cross-referenced with PubMed author affiliations. Used to build the practice-setting picture. | Authoritative-Secondary |
| Healthgrades, Vitals, Google patient reviews | Patient-experience signals only. Surfaced as patient sentiment, never characterized as clinical-quality data. | Aggregator |
We deliberately do not consult the following sources, each for a specific reason:
Each cited source falls into one of three classes, weighted differently when the synthesis evaluates evidence strength: Federal-Primary (federally maintained registries and databases; highest weight), Authoritative-Secondary (official but non-federal sources such as state medical board portals; intermediate weight), and Aggregator (commercial review platforms; lowest weight, never sufficient on its own for a reputation-affecting claim).
The report's audit trail (Section 12 of the full Vetting Report tier) groups every cited URL by source class so the reader can verify class-by-class.
Where a section asserts a probabilistic claim — recommendation likelihood, conflict-risk verdict, malpractice probability — it uses a standard estimative-probability vocabulary: almost no chance (1–5%), very unlikely (5–20%), unlikely (20–45%), roughly even chance (45–55%), likely (55–80%), very likely (80–95%), almost certain (95–99%).
This terminology comes from the U.S. Intelligence Community's analytic-standards directive (ICD 203), which standardizes words of estimative probability; we use it because it forces the synthesis to commit to a band rather than waffle. Bands are picked based on data density, not vibes — when the underlying evidence is thin, the band defaults to "roughly even chance" with an explicit "[insufficient public evidence]" tag.
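The band assignment can be sketched as a lookup against the ICD 203 percentage ranges; the function name and the thin-evidence default below are illustrative:

```python
# (band, lower %, upper %) per the ICD 203 estimative-probability table
BANDS = [
    ("almost no chance", 1, 5),
    ("very unlikely", 5, 20),
    ("unlikely", 20, 45),
    ("roughly even chance", 45, 55),
    ("likely", 55, 80),
    ("very likely", 80, 95),
    ("almost certain", 95, 99),
]

def estimative_band(p):
    """Map a probability in [0, 1] to an ICD 203 band. None signals
    thin evidence: default to the middle band plus the explicit tag."""
    if p is None:
        return "roughly even chance [insufficient public evidence]"
    pct = p * 100
    for name, lo, hi in BANDS:
        if pct <= hi:
            return name
    return "almost certain"  # probabilities above the top band's ceiling
```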
Physician verification carries elevated legal risk: a false claim that lowers a clinician's professional standing can be defamatory per se. The synthesis follows a strict cite-don't-characterize policy:
Any claim that could lower a physician's professional standing must cite a Federal-Primary or Authoritative-Secondary source. Aggregator data alone cannot support a reputation-affecting claim.
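As a sketch, the guardrail reduces to a predicate over a claim's stakes and the classes of its citations. The function and argument names are illustrative; the class names are the three defined on this page:

```python
# Higher rank = stronger evidence class
SOURCE_CLASS_RANK = {
    "Federal-Primary": 3,
    "Authoritative-Secondary": 2,
    "Aggregator": 1,
}

def claim_is_publishable(reputation_affecting, cited_classes):
    """An uncited claim never appears (it is replaced by the
    [insufficient public evidence] tag). A reputation-affecting claim
    additionally needs at least one non-Aggregator citation."""
    if not cited_classes:
        return False
    if reputation_affecting:
        return any(SOURCE_CLASS_RANK.get(c, 0) >= 2 for c in cited_classes)
    return True
```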
Generated last, after the other 11 sections complete. Pulls the verdict-relevant facts from each: NPI status, OIG result, h-index proxy, lifetime industry payments, top specialty, and any HIGH-severity flags. Frames around the requester's use_case (credentialing, patient, expert-witness, self-verification).
The verified-identity record from the NPI Registry: legal name, NPI, credential, enumeration date, primary specialty, sub-specialties, license states. Closes with what the registry does NOT tell us — it isn't a license-of-record verification; the credentialing-of-record path runs through the relevant state board.
The primary specialty plus all sub-specialty taxonomies on the NPI record, against typical specialty norms (practice-setting distribution, sub-specialty variants).
The single highest-stakes section. Leads with the OIG LEIE check result. If a match, the section opens with "HIGH SEVERITY RED FLAG" and quotes the exclusion type and date verbatim. If clean, the section says "No federal exclusion or NPI deactivation found across automated checks as of {date}." State medical board lookup URLs are surfaced for direct verification.
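The LEIE check itself can be sketched as a scan of a local copy of the OIG's downloadable exclusion file. The column names below (NPI, EXCLTYPE, EXCLDATE) follow the LEIE download format at the time of writing; treat them, and the function itself, as an illustrative assumption:

```python
import csv

def leie_check(npi, leie_csv_path):
    """Scan a local copy of the OIG LEIE CSV for a matching NPI.
    Returns (exclusion_type, exclusion_date) on a match, else None.
    The LEIE uses an all-zero NPI for records without one, which must
    never be treated as a match."""
    with open(leie_csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row.get("NPI") == npi and npi != "0000000000":
                return row.get("EXCLTYPE"), row.get("EXCLDATE")
    return None
```

A non-None return is what triggers the "HIGH SEVERITY RED FLAG" opener, with the two returned fields quoted verbatim.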
PubMed total publications, iCite h-index proxy, median Relative Citation Ratio, NIH grants total, ClinicalTrials.gov PI roles. Closes with calibration: a research-active academic profile (high pubs / RCR / grants) vs a clinical-practice-focused profile (low pubs is normal — explicitly noted when total publications are below 20).
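The h-index proxy is the standard h-index definition computed over the per-PMID citation counts iCite returns; the field plumbing is elided and the function name is illustrative:

```python
def h_index_proxy(citation_counts):
    """Largest h such that h papers have at least h citations each,
    computed from the per-PMID citation counts in the iCite pull."""
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank  # this paper still clears the bar
        else:
            break  # counts are sorted, so no later paper can
    return h
```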
Top manufacturers, top categories (consulting / food & beverage / travel / research grants / speaking), lifetime total, most-recent program year. Reported factually. Specialty norms briefly noted (high consulting + speaking concentration in cardiology / oncology / orthopedics is common; near-zero is also common in primary care).
LOW / MODERATE / HIGH framing. HIGH requires a single-manufacturer concentration above 50% of lifetime payments AND lifetime payments above $100,000. MODERATE is top-decile relative to specialty without a research-grant explanation. LOW is everything else, including near-zero. The flag is a risk-review framing, never an indictment.
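The severity rule is mechanical enough to sketch directly. Argument names are illustrative; the thresholds are the ones stated above:

```python
def conflict_flag(lifetime_total, top_manufacturer_total,
                  top_decile_for_specialty=False, has_research_grants=False):
    """HIGH: one manufacturer exceeds 50% of lifetime payments AND
    lifetime payments exceed $100,000.
    MODERATE: top decile for the specialty with no research-grant explanation.
    LOW: everything else, including near-zero."""
    if lifetime_total > 100_000 and top_manufacturer_total > 0.5 * lifetime_total:
        return "HIGH"
    if top_decile_for_specialty and not has_research_grants:
        return "MODERATE"
    return "LOW"
```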
Practice locations from the NPI record cross-referenced with academic affiliations on PubMed papers and clinical-trial team-member rolls.
Five archetype-matched peers from the comparable_physicians_reference table (60 seeded entries). Composite scoring on specialty (×6), academic vs clinical archetype (×3), publication record overlap (×2), career stage (×2). Each comparison cites the peer's source URL.
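A sketch of the composite scoring, assuming each dimension's similarity has been normalized to [0, 1] upstream; the function names and that normalization assumption are illustrative:

```python
# Weights from the peer-matching rubric: specialty x6, archetype x3,
# publication overlap x2, career stage x2
WEIGHTS = {"specialty": 6, "archetype": 3,
           "publication_overlap": 2, "career_stage": 2}

def peer_score(components):
    """components: per-dimension similarity scores in [0, 1].
    Missing dimensions contribute zero."""
    return sum(WEIGHTS[dim] * components.get(dim, 0.0) for dim in WEIGHTS)

def top_peers(components_by_peer, k=5):
    """Rank the seeded candidates by composite score; keep the top five."""
    ranked = sorted(components_by_peer.items(),
                    key=lambda kv: peer_score(kv[1]), reverse=True)
    return [peer_id for peer_id, _ in ranked[:k]]
```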
Currently a thin section in v0.1 — pending Healthgrades / Vitals / Google reviews scrape (v0.2 work). The honest framing is surfaced today: PubMed authorship, NIH funding, ClinicalTrials.gov roles are professional-standing signals, not patient-experience signals. Patient experience needs the deferred aggregator pull.
Currently a v0.2 deferral with explicit honest framing. The OIG LEIE check in Section 4 is the strongest automated malpractice-related signal in v0.1. Civil malpractice case surfacing via CourtListener is the v0.2 build.
PubMed co-authorship graph (top 5 most-frequent collaborators inferred from top-cited works) plus clinical-trial co-officials. v0.1 caveat: direct referral patterns from claims data are not in the public record.
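The collaborator count is a straightforward tally over the author lists of the subject's top-cited works; the function name and input shape are illustrative:

```python
from collections import Counter

def top_collaborators(papers, subject_name, k=5):
    """papers: list of author-name lists from the subject's top-cited
    works (plus clinical-trial co-officials, merged upstream).
    Returns the k most frequent co-authors, excluding the subject."""
    counts = Counter()
    for authors in papers:
        counts.update(a for a in authors if a != subject_name)
    return [name for name, _ in counts.most_common(k)]
```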
An aggregated audit trail of every URL cited across the prior 11 sections, deduplicated and grouped by source class (Federal-Primary, Authoritative-Secondary, Aggregator). The reader can verify each claim by opening its citation.
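The aggregation can be sketched as a dedupe-and-group pass over every (URL, class) citation pair collected while the earlier sections render; names are illustrative:

```python
def audit_trail(citations):
    """citations: iterable of (url, source_class) pairs collected across
    the prior 11 sections. Deduplicates by URL (first occurrence wins)
    and groups by source class, preserving citation order."""
    grouped = {"Federal-Primary": [], "Authoritative-Secondary": [], "Aggregator": []}
    seen = set()
    for url, cls in citations:
        if url not in seen:
            seen.add(url)
            grouped[cls].append(url)
    return grouped
```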
Every claim in a Physician Vetting Report cites a public URL the reader can open and verify. This is not a feature; it is the contract. If a claim cannot be cited, it does not appear in the report — replaced instead with the [insufficient public evidence as of {date}] tag.
This makes the reports auditable in the strict sense: a credentialing committee or compliance reviewer can re-run the verification chain by hand from the citations alone. The report is a research synthesis they can check, not an opaque score.