Before an investor ever wires money to a brokerage, there is a free public tool that will tell them whether the firm is real, whether it is allowed to do business in their state, and whether it has ever been fined, sued, or thrown out of the industry. FINRA's BrokerCheck is that tool, and behind it sits a registry of every broker-dealer firm in the United States: roughly 13,300 firm records, one row per registered firm, each keyed by a CRD number and carrying the firm's name, its registration status, the states and jurisdictions where it may operate, the kinds of securities business it conducts, and its disclosure history—the regulatory actions, arbitrations, and financial events that flag where the risk is.
This article covers what the BrokerCheck firms dataset is and the unusual fact that it is maintained not by a federal agency but by a self-regulatory organization operating under SEC oversight; the Central Registration Depository and the CRD number that keys the whole system; the difference between FINRA and the SEC and why that structure shapes the data; registration status and scope—the states, jurisdictions, and business types a firm is authorized for; the disclosure events that turn a bare registration record into a risk profile; the relationship between the firm registry and the parallel record for individual registered representatives; the broker-misconduct research literature that has been built almost entirely on this data; the compliance-screening and analytical uses; a Python workflow that queries BrokerCheck's sanctioned public services to tally firms by status and state and surface those carrying disclosures; and the caveats—self-reported fields, the active-only snapshot, the limits on bulk access, and the gap between a disclosure and a finding—that every analyst must keep in mind.
What the dataset is
FINRA (the Financial Industry Regulatory Authority) is the organization that licenses and regulates the broker-dealer firms and the individual brokers who buy and sell securities for the public across the United States. It is not a government agency. It is a self-regulatory organization (SRO): a private, non-governmental body that the securities laws authorize to write and enforce rules for its member firms, operating under the supervision of the Securities and Exchange Commission. BrokerCheck is FINRA's free public tool that discloses the registration, history, and disciplinary record of brokerage firms and individual brokers, so that an investor can vet whom they are about to do business with before opening an account. The firms half of BrokerCheck, the subject of this piece, is the registry of registered broker-dealers. In our database it is stored as the table finra_firms, comprising roughly 13,300 firm records spanning both the broker-dealers currently registered (about 3,300 today) and the historical and deregistered firms BrokerCheck retains.
The grain of the table is one row per registered firm. Each row captures who the firm is, whether it is currently registered, where it is allowed to operate, what kind of securities business it does, and a summary of its disclosure history. The columns are:
crd_number            -- Central Registration Depository ID (the primary key)
firm_name             -- the firm's name as registered
other_names           -- doing-business-as and prior names
status                -- registration status (active / inactive)
sec_number            -- the firm's SEC registration (8-NNNNN) number
main_office           -- main office city, state, ZIP
registered_states     -- states / jurisdictions where the firm is registered
self_regulatory_orgs  -- SROs the firm belongs to (FINRA, exchanges)
business_types        -- the securities business activities the firm conducts
disclosure_count      -- number of disclosure events on the firm's record
disclosure_events     -- regulatory actions, arbitrations, financial events
registration_date     -- when the firm first registered
The crd_number is the load-bearing column. The CRD number is the firm's identifier in the Central Registration Depository, the licensing and registration system that FINRA operates for the securities industry; it is a persistent integer assigned to a firm when it first registers and it never changes, even as the firm renames itself, merges, or relocates. It is the join key that ties a firm's registration record to its disclosure history, to the records of the individual brokers it employs, and to the firm's entry in companion systems. The status field records whether the firm is currently registered or whether its registration has lapsed, been withdrawn, or been revoked: a firm that has been expelled from the industry is not an active broker-dealer even if its historical record persists. The registered_states and business_types fields define the firm's authorized footprint: a broker-dealer is registered to operate in specific jurisdictions and to conduct specific kinds of business, and a firm acting outside that scope is itself a compliance problem. The disclosure_events field is the substantive payload. It is what turns a bare registration record into a risk profile, and it is where the analytical value of the dataset concentrates.
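The column list above can be mirrored as a typed record. The sketch below is one possible representation rather than the table's actual definition: the Python types, the comma-delimited encoding of registered_states, and the from_row helper are all assumptions made for illustration, and only a subset of the columns is shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class FirmRecord:
    """A subset of one finra_firms row; field types are illustrative assumptions."""
    crd_number: int                       # primary key, stable for the life of the firm
    firm_name: str
    status: str                           # "active" / "inactive"
    sec_number: Optional[str] = None      # e.g. "8-12345"
    main_office: Optional[str] = None
    registered_states: List[str] = field(default_factory=list)
    disclosure_count: int = 0


def from_row(row: dict) -> FirmRecord:
    """Build a FirmRecord from a raw dict, normalizing the CRD key to an int."""
    crd = int(str(row["crd_number"]).strip())
    if crd <= 0:
        raise ValueError(f"invalid CRD number: {row['crd_number']!r}")
    # Assumption: registered_states arrives as a comma-delimited string.
    states = [s.strip().upper()
              for s in (row.get("registered_states") or "").split(",")
              if s.strip()]
    return FirmRecord(
        crd_number=crd,
        firm_name=row["firm_name"].strip(),
        status=(row.get("status") or "").strip().lower(),
        sec_number=row.get("sec_number"),
        main_office=row.get("main_office"),
        registered_states=states,
        disclosure_count=int(row.get("disclosure_count") or 0),
    )


rec = from_row({"crd_number": " 00123 ", "firm_name": "Example Securities LLC ",
                "status": "Active", "registered_states": "ny, nj",
                "disclosure_count": "2"})
print(rec.crd_number, rec.status, rec.registered_states)
```

Normalizing the CRD number to an integer at load time is the design choice that matters here: every later join depends on that key comparing equal across sources.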
The CRD and the Central Registration Depository
Everything in the BrokerCheck firms data hangs off the Central Registration Depository (CRD), the computerized registration and licensing system that FINRA maintains for the US securities industry. The CRD is the back-office system of record; BrokerCheck is the public window into a curated subset of it. When a firm wants to operate as a broker-dealer, it registers through the CRD—filing the uniform application for broker-dealer registration (Form BD)—and is assigned a CRD number. Individual brokers register through the same system (filing the uniform application Form U4) and receive their own individual CRD numbers. The CRD therefore links every firm to every broker it has employed and every broker to every firm they have worked at, building the employment-and-registration graph that underlies the whole industry's oversight.
The crucial property of the CRD number is its persistence and uniqueness. Because the number is assigned once and never reused or changed, it survives the events that would otherwise break identity resolution: a firm that rebrands, that is acquired and folded into a parent, or that changes its legal form keeps the same CRD number, and a broker who moves between five firms over a career carries the same individual CRD number through all of them. This is what makes the data tractable for analysis in a way that names alone would not: matching a firm by name across datasets is fraught, because names are duplicated, abbreviated, and reused, but matching by CRD number is exact. For an analyst, the CRD number is the equivalent of the CIK in SEC filings or the NDC in drug data: the stable key that lets the firm record join cleanly to everything else. The firm CRD also distinguishes the registry described here from the individual-broker records that share the same BrokerCheck front end; a firm CRD and an individual CRD are drawn from the same numbering space but identify different kinds of entity, and conflating them is a common beginner mistake.
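The contrast between name matching and CRD matching can be made concrete with a small sketch. The two firms, their names, and their CRD numbers below are invented for illustration; the point is only that a renamed or abbreviated firm drops out of a name join and survives a key join.

```python
# Hypothetical rows describing the same two firms in two different sources.
registry = [
    {"crd_number": 1001, "firm_name": "Acme Securities LLC"},
    {"crd_number": 1002, "firm_name": "Acme Securities, Inc."},
]
vendor_list = [
    {"crd_number": 1001, "firm_name": "ACME SECURITIES"},         # abbreviated
    {"crd_number": 1002, "firm_name": "Summit Capital Markets"},  # renamed
]


def match_by_name(left, right):
    """Return CRDs from `left` whose lowercased name appears in `right`."""
    names = {r["firm_name"].lower() for r in right}
    return [l["crd_number"] for l in left if l["firm_name"].lower() in names]


def match_by_crd(left, right):
    """Return CRDs from `left` whose CRD number appears in `right`."""
    keys = {int(r["crd_number"]) for r in right}
    return [l["crd_number"] for l in left if int(l["crd_number"]) in keys]


name_hits = match_by_name(registry, vendor_list)
crd_hits = match_by_crd(registry, vendor_list)
print("matched by name:", name_hits)
print("matched by CRD: ", crd_hits)
```

Fuzzy name matching could recover the abbreviated firm, but it would still miss the rename and would risk merging the two distinct "Acme" entities, which is why the key join is preferred.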
FINRA, the SEC, and the self-regulatory structure
The single most important structural fact about this dataset, and the reason it sits in a federal-data collection while being described as “federal-adjacent”, is that FINRA is a self-regulatory organization, not a federal agency. The US securities laws, principally the Securities Exchange Act of 1934, establish a layered system in which the SEC, a federal agency, sits at the top with statutory authority over the securities markets, but much of the day-to-day regulation of broker-dealers is delegated to SROs that the SEC oversees. FINRA is the SRO for the broker-dealer industry. It was created in 2007 from the consolidation of the National Association of Securities Dealers (NASD) with the member-regulation functions of the New York Stock Exchange, bringing the rule-writing and examination of broker-dealers under a single organization.
What FINRA does, it does subject to SEC supervision. FINRA writes rules for its member firms, but those rules must be filed with and approved by the SEC; FINRA examines and disciplines member firms and brokers, but its disciplinary decisions can be appealed to the SEC and from there to the federal courts; and FINRA operates the registration and licensing machinery—the CRD, the qualification examinations, BrokerCheck itself—under authority the federal framework grants it. The practical consequence for the data is that the firm registry is an SRO product reflecting an SRO's membership and disciplinary record, not a federal agency's administrative database. That distinction matters for scope: BrokerCheck covers firms and brokers within FINRA's registration system, and while that is nearly the entire retail broker-dealer industry, conduct that is purely a matter for the SEC, for state securities regulators, or for other regulators may sit outside or only partly inside the FINRA record. The disclosure history on a firm's record does pull in events from those other regulators—a firm sanctioned by the SEC or by a state will generally carry that as a disclosure—but the organizing frame is FINRA's, and an analyst should understand the registry as the membership-and-discipline record of the broker-dealer SRO operating under SEC oversight, not as a direct extract from a federal agency.
Registration status and scope
A broker-dealer's entry in the registry is, at its core, a statement of what the firm is authorized to do and whether that authorization is currently in force. Three groups of fields define this, and they are the first thing any screening or analysis reads.
Registration status is the threshold fact. A firm is either currently registered as a broker-dealer or it is not. Active registration means the firm has filed and maintained its Form BD, is a current member in good standing of FINRA, and is authorized to conduct securities business. A firm can lose that status in several ways—by voluntarily withdrawing its registration when it winds down, by having its registration lapse for failure to maintain the requirements, or, at the punitive end, by being suspended or expelled through a disciplinary action. The distinction between a firm that withdrew quietly and a firm that was expelled is enormous from a risk standpoint, and both leave the firm without active status, so status alone is not enough—the disclosure history is what tells the two apart.
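A minimal sketch of that point: status separates active from inactive, and only the disclosure events separate a quiet withdrawal from a punitive exit. The event shape used here (a "family" key and a "sanction" key) and the sanction vocabulary are hypothetical; the real disclosure payload would need to be mapped onto something like it first.

```python
# Assumed sanction labels; not FINRA's official vocabulary.
PUNITIVE_SANCTIONS = {"expulsion", "revocation", "suspension"}


def exit_kind(firm: dict) -> str:
    """Classify a firm using status first, then its regulatory disclosures."""
    if (firm.get("status") or "").lower() == "active":
        return "active"
    sanctions = {
        (event.get("sanction") or "").lower()
        for event in firm.get("disclosure_events", [])
        if (event.get("family") or "").lower() == "regulatory"
    }
    if sanctions & PUNITIVE_SANCTIONS:
        return "inactive: punitive sanction on record"
    return "inactive: no punitive sanction on record"


withdrew = {"status": "inactive", "disclosure_events": []}
expelled = {"status": "inactive",
            "disclosure_events": [{"family": "regulatory", "sanction": "Expulsion"}]}
print(exit_kind(withdrew))
print(exit_kind(expelled))
```

Both example firms carry the same status value, and the classification differs only because of what sits in the disclosure history.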
Jurisdictional scope is recorded in the registered states and SRO memberships. Broker-dealer regulation is genuinely dual: a firm registers at the federal/SRO level and also registers (or notice-files) in each state where it does business, under that state's blue-sky securities laws administered by the state securities regulator. The registry therefore lists the jurisdictions in which a firm is registered, and that list is meaningful: a small regional broker-dealer registered in three states is a different proposition from a national wirehouse registered in every state and territory. A firm soliciting business in a state where it is not registered is itself a violation, so the registered-states field is both a description and a compliance boundary.
Business types complete the picture by recording what the firm actually does: whether it acts as a broker or dealer in corporate securities, municipal securities, mutual funds, variable annuities, options, or other products, whether it carries customer accounts or clears through another firm, and whether it engages in activities such as private placements or investment banking. Two firms with identical active status can have completely different risk surfaces depending on whether one is a plain-vanilla mutual-fund distributor and the other a complex-product, customer-carrying operation; the business-types field is what separates them.
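Treating the registered states and business types as a boundary can be sketched as a simple membership check. The list-valued fields and the activity labels below are assumptions for illustration; the actual business-type vocabulary in the registry is richer and would need to be normalized before a check like this is meaningful.

```python
def out_of_scope(firm: dict, state: str, activity: str) -> list:
    """Return the reasons a proposed engagement falls outside the firm's record."""
    problems = []
    states = {s.upper() for s in firm.get("registered_states", [])}
    if state.upper() not in states:
        problems.append(f"not registered in {state.upper()}")
    activities = {b.lower() for b in firm.get("business_types", [])}
    if activity.lower() not in activities:
        problems.append(f"activity not on record: {activity}")
    return problems


# Hypothetical regional firm with a narrow footprint.
regional = {
    "registered_states": ["OH", "KY", "IN"],
    "business_types": ["mutual funds", "variable annuities"],
}
print(out_of_scope(regional, "oh", "Mutual Funds"))
print(out_of_scope(regional, "CA", "private placements"))
```

An empty list means the engagement is inside the recorded footprint; it does not by itself establish that the firm is suitable.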
Disclosure events: the risk payload
The reason BrokerCheck exists, and the reason this dataset is a compliance-screening source rather than a mere directory, is the disclosure. A disclosure is a reportable event in a firm's history that bears on its integrity or its dealings with customers, and FINRA's rules require firms to report them through the CRD, where they become part of the public record. The disclosure history is the difference between knowing that a firm exists and knowing whether it can be trusted, and it falls into a few recognizable families.
Regulatory actions are the most serious family. These are formal actions taken against the firm by a regulator—FINRA itself, the SEC, a state securities regulator, or another SRO or exchange—for violating a rule or a securities law. They span the spectrum from a censure and a modest fine for a recordkeeping or supervision lapse, through suspensions of particular business lines, up to the severest sanction an SRO can impose: expulsion from FINRA membership, which ends the firm's ability to operate as a broker-dealer. FINRA enforcement—the fines, suspensions, and expulsions it hands down—is recorded here, and the aggregate of these actions across the membership is, in effect, the public ledger of how the SRO polices its members.
Arbitrations and customer disputes are the second family, and they are distinctive because they originate not with a regulator but with customers. Most disputes between an investor and a broker-dealer are resolved not in court but through FINRA-administered arbitration, the forum specified in the account agreements that customers sign. A customer who alleges that a firm churned their account, sold them unsuitable products, misrepresented an investment, or failed to supervise a rogue broker brings that claim in FINRA arbitration, and the claim and its outcome (an award against the firm, or a settlement) become a disclosure on the firm's record. Customer-dispute disclosures are a leading indicator of a different kind of risk than regulatory actions: they reflect the experience of the firm's actual customers rather than the judgment of a regulator.
The third family, financial events, covers the firm's financial integrity: bankruptcies, the appointment of a receiver, certain liens and judgments, and similar events that bear on whether the firm is financially sound enough to hold customer assets. A firm that has been through bankruptcy, or that carries unsatisfied judgments, presents a financial-stability risk on top of any conduct risk.
Read together, the disclosure count and the categories of disclosure on a firm's record are the single most informative thing in the registry: a firm with no disclosures across decades of operation is a very different counterparty from one carrying a thick file of regulatory actions and customer awards.
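Reading the count and the categories together amounts to tallying events by family. The sketch below assumes each disclosure event has already been tagged with a "family" label using the three families described above; those labels are this example's own, not field values taken from BrokerCheck.

```python
from collections import Counter

FAMILIES = ("regulatory", "customer_dispute", "financial")  # assumed labels


def disclosure_profile(events: list) -> dict:
    """Count disclosure events per family, bucketing anything else as 'other'."""
    counts = Counter(e.get("family", "unknown") for e in events)
    profile = {family: counts.get(family, 0) for family in FAMILIES}
    profile["other"] = sum(n for family, n in counts.items()
                           if family not in FAMILIES)
    profile["total"] = sum(counts.values())
    return profile


# Hypothetical disclosure history for one firm.
events = [
    {"family": "regulatory"},
    {"family": "customer_dispute"},
    {"family": "customer_dispute"},
    {"family": "criminal"},
]
profile = disclosure_profile(events)
print(profile)
```

Two firms with the same total can have very different profiles, which is the reason to keep the per-family breakdown next to the raw count.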
Firms and the parallel record for brokers
The firm registry is one half of BrokerCheck; the other is the record for individual registered representatives—the brokers themselves. The two are deeply intertwined, because a firm acts through its brokers, and understanding the relationship is essential to using the firm data well.
Every individual who sells securities for a broker-dealer must register through the CRD, filing Form U4 and passing the qualification examinations for the activities they will conduct. Each broker has their own individual CRD number and their own BrokerCheck record, which carries their employment history (every firm they have been registered with, by CRD number and dates), their qualifications, and—exactly as with firms—their own disclosure history of regulatory actions, customer disputes, terminations, financial events, and criminal matters. The firm CRD and the broker CRD link the two records: from a firm you can enumerate the brokers it currently employs and, historically, the brokers it has employed; from a broker you can trace their movement across firms. This linkage is what makes the most powerful analyses possible. A firm's risk is not only in its own disclosure file but in the disclosure histories of the brokers it hires—a firm that systematically recruits brokers with prior misconduct is a recognizable risk pattern that only the firm-to-broker join reveals.
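The firm-to-broker join described above can be sketched with two small in-memory tables. The employment spells, the broker disclosure counts, and every CRD number here are invented; the sketch only shows the shape of the computation, which is the share of a firm's current brokers who carry at least one disclosure.

```python
from collections import defaultdict

# Hypothetical employment spells: (broker_crd, firm_crd, end_year or None if current).
employment = [
    (501, 1001, None), (502, 1001, None), (503, 1001, 2019),
    (504, 1002, None), (505, 1002, None),
]
# Hypothetical disclosure counts keyed by individual CRD number.
broker_disclosures = {501: 3, 502: 0, 503: 1, 504: 0, 505: 0}


def disclosed_share_by_firm(employment, broker_disclosures):
    """For each firm, the fraction of current brokers with any disclosure."""
    current = defaultdict(set)
    for broker, firm, end_year in employment:
        if end_year is None:          # keep only open (current) spells
            current[firm].add(broker)
    return {
        firm: sum(1 for b in brokers if broker_disclosures.get(b, 0) > 0) / len(brokers)
        for firm, brokers in current.items()
    }


shares = disclosed_share_by_firm(employment, broker_disclosures)
print(shares)
```

Note that firm and individual CRD numbers are kept in separate positions of each tuple, so the two kinds of identifier are never compared with each other.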
It is worth noting where the boundary of the system lies. FINRA's CRD covers broker-dealers and their registered representatives. A closely related but distinct population—investment adviser firms and their representatives—is registered through the parallel Investment Adviser Registration Depository and surfaced publicly through the Investment Adviser Public Disclosure (IAPD) system, which FINRA operates on behalf of the SEC and the states. Many firms and individuals are dually registered—a broker-dealer that is also an investment adviser, a representative who is both a registered broker and an investment adviser representative—and BrokerCheck and IAPD are designed to cross-reference each other. For an analyst, the consequence is that the broker-dealer registry described here is one facet of a slightly larger registration universe, and a complete picture of a financial professional sometimes requires reading the adviser record alongside the broker-dealer record.
The broker-misconduct research literature
One measure of how valuable this dataset is to analysts is that an entire academic literature on financial-adviser and broker misconduct has been built almost entirely on the CRD/BrokerCheck record. Because the data is national, person- and firm-resolved by CRD number, longitudinal (employment and disclosure histories span careers), and public, it has become the standard empirical foundation for studying how misconduct happens, where it concentrates, and how the industry responds to it.
The findings that this literature has surfaced illustrate the kinds of questions the data answers. Misconduct is concentrated, not uniformly distributed—a minority of firms and a minority of brokers account for a disproportionate share of disclosures, so risk is highly predictable from history. There is recidivism and mobility: brokers with prior misconduct are more likely to engage in future misconduct, and—strikingly—they often move to firms with their own elevated misconduct records, a sorting pattern that only the firm-to-broker employment graph in the CRD makes visible. There are firm-level cultures: some firms tolerate and even attract problem brokers while others do not, a difference detectable in the relative disclosure rates across the membership. And the record supports the study of how misconduct is distributed across customers, including evidence that it falls more heavily on less-sophisticated and more-vulnerable investors. The relevance to a practical analyst is direct: these are not merely academic findings but validated patterns that justify treating a firm's and its brokers' disclosure history as a genuine, forward-looking risk signal—the empirical basis for the compliance-screening uses described next.
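The concentration claim can be checked on any slice of the data with a simple top-share calculation. This is a generic concentration measure offered as a sketch, not the method used in the research literature, and the disclosure counts below are made up.

```python
import math


def top_share(counts, top_fraction=0.1):
    """Share of all disclosures held by the top `top_fraction` of firms."""
    total = sum(counts)
    if not counts or total == 0:
        return 0.0
    k = max(1, math.ceil(len(counts) * top_fraction))   # number of firms in the top slice
    return sum(sorted(counts, reverse=True)[:k]) / total


# Hypothetical disclosure counts for ten firms.
counts = [40, 5, 3, 1, 1, 0, 0, 0, 0, 0]
share = top_share(counts, top_fraction=0.1)
print(f"top 10% of firms hold {share:.0%} of disclosures")
```

A value far above the slice size (here, one firm in ten holding most of the events) is what concentration looks like in practice; a uniform distribution would give a share close to the fraction itself.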
Compliance screening and analytical uses
Because the registry is public, structured, CRD-keyed, and carries a disclosure history, it supports a set of uses ranging from the single investor lookup BrokerCheck was built for to systematic, dataset-scale analysis.
Counterparty and vendor compliance screening is the foundational use. Before an investor opens an account, a fund places business with a broker-dealer, or an institution onboards a financial counterparty, the registry answers the threshold questions: is this firm actually a registered broker-dealer, is it registered in the relevant state, is its status active, and what does its disclosure history show? A serious screen does not stop at the firm record—it follows the CRD links to the named principals and key brokers, because a clean firm record staffed by brokers with heavy disclosure histories is a risk the firm record alone conceals. This is the broker-dealer equivalent of the multi-list counterparty screening that compliance programs run against federal exclusion and sanctions lists, and it slots naturally into the same pipeline.
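The threshold questions can be sketched as a single function that returns a list of flags for a reviewer. The record shapes and the screen_firm name are hypothetical, and a flag here means "look at this", not "reject": the function deliberately reports principals with disclosures even when the firm's own record is clean.

```python
def screen_firm(firm, state, principals=()):
    """Answer the threshold screening questions for one firm record.

    `firm` is None when no registry record exists for the CRD number.
    """
    if firm is None:
        return {"registered_bd": False, "flags": ["no registry record found"]}
    flags = []
    if (firm.get("status") or "").lower() != "active":
        flags.append("registration not active")
    if state.upper() not in {s.upper() for s in firm.get("registered_states", [])}:
        flags.append(f"not registered in {state.upper()}")
    if firm.get("disclosure_count", 0) > 0:
        flags.append(f"{firm['disclosure_count']} firm disclosure(s) to review")
    # Follow the CRD links: a clean firm record can still be staffed by
    # individuals with heavy disclosure histories.
    flagged = [p["crd_number"] for p in principals if p.get("disclosure_count", 0) > 0]
    if flagged:
        flags.append(f"principals with disclosures: {flagged}")
    return {"registered_bd": True, "flags": flags}


# Hypothetical firm and principals.
firm = {"crd_number": 1001, "status": "active",
        "registered_states": ["NY", "NJ"], "disclosure_count": 0}
principals = [{"crd_number": 501, "disclosure_count": 4},
              {"crd_number": 502, "disclosure_count": 0}]
result = screen_firm(firm, "ny", principals)
print(result)
```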
Disclosure-rate and enforcement-pattern analysis exploits the dataset at scale. Aggregating disclosure counts across the membership reveals how risk concentrates: which firms carry the heaviest disclosure burdens, how disclosure rates vary by firm size, business type, and registration footprint, and how the mix of regulatory actions, customer disputes, and financial events differs across the industry. Tracking FINRA enforcement (the fines, suspensions, and expulsions) over time shows the cadence and focus of the SRO's policing.
Firm-to-broker network analysis is the most distinctive use: building the employment graph from the CRD links lets an analyst trace where problem brokers cluster, identify firms that disproportionately hire previously-disclosed brokers, and follow cohorts of brokers as an expelled firm collapses and its representatives scatter to new firms.
Finally, market-structure and industry-trend analysis, which counts firms by status, registration scope, and business type over time, documents consolidation, the decline in the number of independent broker-dealers, and shifts in the kinds of business the industry conducts. None of this is visible from a single BrokerCheck lookup; it requires treating the registry as a dataset.
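Following a cohort of brokers out of a collapsed firm is a walk over employment spells. The sketch below assumes spells of the form (broker CRD, firm CRD, start year, end year) with invented values, and counts the next firm each broker registered with after leaving the firm of interest.

```python
from collections import Counter

# Hypothetical spells: (broker_crd, firm_crd, start_year, end_year or None).
spells = [
    (501, 1001, 2015, 2020), (501, 2001, 2020, None),
    (502, 1001, 2016, 2020), (502, 2001, 2021, None),
    (503, 1001, 2012, 2020), (503, 2002, 2020, None),
    (504, 3001, 2010, None),
]


def destinations(spells, firm_crd):
    """Count the firms that brokers moved to immediately after leaving `firm_crd`."""
    by_broker = {}
    for broker, firm, start, end in spells:
        by_broker.setdefault(broker, []).append((start, firm, end))
    moves = Counter()
    for history in by_broker.values():
        history.sort(key=lambda spell: spell[0])        # order each career by start year
        for (_, firm, end), (_, next_firm, _) in zip(history, history[1:]):
            if firm == firm_crd and end is not None:
                moves[next_firm] += 1
    return moves


landing = destinations(spells, 1001)
print(landing.most_common())
```

Joining the destination firms back to their own disclosure profiles is the step that would show whether the cohort clustered at firms with elevated records.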
Python workflow: querying the BrokerCheck firm services
The script below queries the public search services that back the BrokerCheck website, pulls a slice of active firm records, and computes the core tallies: firms by registration status, firms by main-office state, and the share of firms carrying at least one disclosure event. An important caveat governs the whole approach. BrokerCheck is a public-facing investor tool, not a bulk-data product; FINRA is a self-regulatory organization and its terms of use restrict automated scraping, so programmatic bulk access is limited. Any workflow should use FINRA's sanctioned access, respect the rate limits, and confirm the current method against FINRA's documentation before running at scale—the script is deliberately polite (it paginates modestly, sleeps between requests, and caps the demo pull) and resolves field names defensively because the service envelope shape is version-dependent.
import time
from collections import Counter

import pandas as pd
import requests

# FINRA BrokerCheck public system -- brokercheck.finra.org.
#
# BrokerCheck is a public-facing tool, not a bulk-data product. FINRA
# is a self-regulatory organization and its terms of use restrict
# automated scraping; programmatic bulk access is limited. Any workflow
# should use FINRA's sanctioned access (the public BrokerCheck search
# services that back the website) and confirm the current method and
# rate limits against FINRA's documentation before running at scale.
#
# The endpoints below are the JSON services the BrokerCheck site itself
# calls. Treat the exact paths as version-dependent and isolate them.
SEARCH = "https://api.brokercheck.finra.org/search/firm"
DETAIL = "https://api.brokercheck.finra.org/firm"
HEADERS = {"Accept": "application/json"}
PAGE_SIZE = 100

def search_firms(query, rows=PAGE_SIZE, start=0):
    # 'hl' is the hit-highlighting flag; 'wt=json' asks for JSON. The
    # service paginates with 'start' and 'rows'.
    params = {
        "query": query, "filter": "active=true",
        "hl": "true", "wt": "json",
        "rows": rows, "start": start,
    }
    r = requests.get(SEARCH, params=params, headers=HEADERS, timeout=60)
    r.raise_for_status()
    return r.json()

def firm_detail(crd):
    # Per-firm record keyed by the firm CRD number.
    r = requests.get(f"{DETAIL}/{crd}", params={"wt": "json"},
                     headers=HEADERS, timeout=60)
    r.raise_for_status()
    return r.json()

def _docs(payload):
    # The request parameters are Solr-style, but the firm content sits in
    # hits.hits[]._source, an Elasticsearch-style envelope. Shapes vary by
    # release, so dig defensively.
    hits = (payload.get("hits") or {}).get("hits") or []
    out = []
    for h in hits:
        src = h.get("_source") or h
        out.append(src)
    return out

# --- Pull a slice of active firms ------------------------------------
records, start = [], 0
while True:
    page = search_firms("*", rows=PAGE_SIZE, start=start)
    docs = _docs(page)
    if not docs:
        break
    records.extend(docs)
    print(f"  fetched {len(records):,} firm records...")
    start += PAGE_SIZE
    if start >= 500:      # demo cap; remove for a full pull
        break
    time.sleep(0.5)       # be polite to a public service

df = pd.DataFrame(records)
print(f"Firm records pulled: {len(df):,}")

def col(frame, *cands):
    # Return the first candidate column name present, case-insensitively.
    low = {c.lower(): c for c in frame.columns}
    for c in cands:
        if c.lower() in low:
            return low[c.lower()]
    return None
c_status = col(df, "firm_status", "status", "registration_status")
# The state must come from an address-like column, never from the
# other-names field; "main_office" is the column name used in finra_firms.
c_state = col(df, "main_office", "main_office_location", "state")
c_disc = col(df, "firm_disclosure_fl", "disclosure_count", "disclosures")

# --- 1. Tally firms by registration status ---------------------------
if c_status:
    print("\nFirms by registration status:")
    for s, n in df[c_status].fillna("(blank)").value_counts().items():
        print(f"  {str(s):<24} {n:>6,}")

# --- 2. Tally firms by main-office state -----------------------------
if c_state:
    print("\nTop 10 states by firm count:")
    # "city, state, ZIP" ends in the ZIP, so slicing the last two characters
    # would return digits. Take the last standalone two-letter uppercase
    # token instead (assumes the state is written as a postal abbreviation).
    states = (df[c_state].fillna("").astype(str)
              .str.findall(r"\b[A-Z]{2}\b").str[-1]
              .fillna("(unknown)"))
    for st, n in Counter(states).most_common(10):
        print(f"  {st:<10} {n:>6,}")

# --- 3. Surface firms carrying disclosure events ---------------------
if c_disc:
    raw = df[c_disc]
    # The field may be a Y/N-style flag or a numeric count; accept either.
    is_flag = raw.fillna("").astype(str).str.upper().isin(
        ["TRUE", "1", "Y", "YES"])
    is_count = pd.to_numeric(raw, errors="coerce").fillna(0) > 0
    flagged = df[is_flag | is_count]
    rate = len(flagged) / max(len(df), 1)
    print(f"\nFirms with at least one disclosure event: "
          f"{len(flagged):,} ({rate:.1%} of the sample)")
Two practical notes apply. First, the disclosure tally in the script is a coarse first pass: it counts firms with any disclosure flag, which is a blunt instrument. A rigorous analysis must read the individual disclosure events—separating a single decades-old recordkeeping censure from an active pattern of customer awards—and should weight by the category and recency of the disclosure rather than treating all disclosures as equal, for the reasons the caveats section develops. Second, for any work beyond a small sample the right path is FINRA's sanctioned data access rather than aggressive pagination of the public site: confirm whether a licensed or bulk channel exists for the use case, because scraping the investor-facing tool at scale both strains a public service and runs against FINRA's terms. The script demonstrates the shape of the analysis on a polite sample; it is not a license to download the whole registry by brute force.
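Weighting by category and recency, as the first note suggests, might look like the sketch below. The category weights and the five-year half-life are arbitrary placeholders chosen for illustration, not calibrated values, and the resulting number is a triage aid for ordering a review queue rather than a measure of wrongdoing.

```python
# Placeholder weights and half-life; these are assumptions, not calibrated values.
CATEGORY_WEIGHT = {"regulatory": 3.0, "customer_dispute": 2.0, "financial": 1.5}
HALF_LIFE_YEARS = 5.0


def weighted_disclosure_score(events, as_of_year):
    """Sum category weights, halving each event's weight every HALF_LIFE_YEARS."""
    score = 0.0
    for event in events:
        age = max(0, as_of_year - event["year"])
        decay = 0.5 ** (age / HALF_LIFE_YEARS)
        score += CATEGORY_WEIGHT.get(event["family"], 1.0) * decay
    return score


# One old censure versus a recent pattern of customer disputes (hypothetical).
old_censure = [{"family": "regulatory", "year": 2004}]
recent_pattern = [{"family": "customer_dispute", "year": 2023},
                  {"family": "customer_dispute", "year": 2024}]
print(round(weighted_disclosure_score(old_censure, 2024), 3))
print(round(weighted_disclosure_score(recent_pattern, 2024), 3))
```

Under a flat count the recent pattern scores only twice the old censure; with decay the gap is much wider, which is closer to how a reviewer would read the two files.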
Limitations and analytical caveats
The BrokerCheck firms record is the authoritative public registry of US broker-dealers, but it carries structural limitations that an analyst must internalize before drawing conclusions—or, worse, before acting on a disclosure.
A disclosure is not a finding. The most important caveat is that the presence of a disclosure does not establish that the firm did anything wrong. A customer-dispute disclosure records that a customer made an allegation—not that the allegation was proven; many are settled without any admission, and some are denied or dismissed. Even a regulatory action varies enormously in gravity, from a minor technical censure to a fraud-driven expulsion. Treating a raw disclosure count as a misconduct score, or reading any single disclosure as a verdict, over-reads what the field can bear. The category, the outcome, the amount, and the recency of each disclosure all matter, and they live in the individual disclosure events, not in the summary count.
The data is self-reported, with the gaps that implies. Firms and brokers report their own disclosures through the CRD under FINRA's rules. The system is backstopped by enforcement, since failing to disclose a reportable event is itself a serious violation, but the reporting obligation runs through the regulated parties, and underreporting and reporting lag are real. An event that should be disclosed may not yet appear, or may be characterized in the most favorable available terms. The registry is authoritative for what has been reported and recorded; it is not a guarantee that everything reportable has been reported.
BrokerCheck is an active-focused snapshot, and bulk access is limited. The public tool is oriented toward the investor's question—is this firm one I can do business with now?—so it foregrounds currently registered firms and a defined window of historical disclosure. Records of firms that left the industry long ago, and the full historical depth of some disclosures, may be abbreviated or absent from the public view relative to the complete CRD. And because BrokerCheck is a public investor tool rather than a bulk-data product, programmatic mass extraction is restricted by FINRA's terms; a longitudinal study of the entire industry over decades cannot simply be scraped from the site and must use FINRA's sanctioned channels and, where appropriate, captured snapshots over time.
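Working from captured snapshots means comparing them by CRD number. The sketch below assumes each snapshot has been reduced to a mapping from firm CRD number to status; the snapshot contents and the diff_snapshots helper are illustrative only.

```python
def diff_snapshots(earlier: dict, later: dict) -> dict:
    """Compare two {crd_number: status} snapshots keyed by firm CRD number."""
    before, after = set(earlier), set(later)
    return {
        "entered": sorted(after - before),
        "left": sorted(before - after),
        "status_changed": sorted(crd for crd in before & after
                                 if earlier[crd] != later[crd]),
    }


# Hypothetical snapshots captured a year apart.
snap_2023 = {1001: "active", 1002: "active", 1003: "active"}
snap_2024 = {1001: "active", 1002: "inactive", 1004: "active"}
changes = diff_snapshots(snap_2023, snap_2024)
print(changes)
```

A firm that is absent from a later capture has only dropped out of the public view that was captured; whether it withdrew, was expelled, or was simply missed by the pull has to be established from the record itself.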
It is an SRO record with defined edges. Because FINRA is a self-regulatory organization rather than a federal agency, the registry is scoped to FINRA's registration system. It is comprehensive for the retail broker-dealer industry, and it pulls in disclosures of actions by the SEC and state regulators, but it is organized around FINRA membership, and the parallel investment-adviser registration system (IAPD) and the actions of regulators outside FINRA's frame are only partly reflected. An analyst who needs the full regulatory picture of a financial professional or firm must read the broker-dealer record alongside the adviser record and the relevant SEC and state-regulator sources rather than assuming BrokerCheck is the whole story.
Held with these caveats in mind, the finra_firms table is a uniquely valuable resource: a CRD-keyed, status- and scope-tagged, disclosure-scored registry of every broker-dealer in the United States—the public record, maintained by the industry's self-regulatory organization under SEC oversight, that lets an investor or an analyst answer the question that precedes every securities transaction: who, exactly, is on the other side of this account, and what does their history say about whether to trust them?
Related writing
SEC EDGAR Company Registry: The Federal Index That Resolves Every Public Company — The SEC sits above FINRA in the securities-regulation hierarchy, and its EDGAR company registry is the companion identity layer: where the CRD number resolves a broker-dealer, the CIK resolves a public company, and serious securities analysis joins the two.
SEC Form 4 Insider Trading: The Federal Database Behind Corporate Insider Stock Transactions — Another SEC disclosure regime keyed on named parties and exact identifiers, Form 4 rewards the same careful entity resolution by identifier rather than name that the CRD number enables for broker-dealers and their representatives.
Compliance Screening Across 30+ Federal Enforcement Lists: How the Risk Score Works — The BrokerCheck disclosure history slots directly into multi-list counterparty screening, where a firm's and its brokers' regulatory and customer-dispute record becomes one input among many to an aggregate risk score.