Published Data Sources & References
Curated catalog of publicly available data sources used across the Use Cases & Applied Analytics documentation. All URLs point to published resources maintained by U.S. federal agencies or established research institutions.
Federal Open Data Portals
| Source | URL | Description | Formats |
| Data.gov | https://catalog.data.gov | Central U.S. government open data catalog with 300,000+ datasets | CSV, JSON, XML, API |
| DOJ Open Data Portal | https://www.justice.gov/open/open-data | Department of Justice datasets including case filings, grants, and statistics | CSV, JSON, PDF |
| USAspending.gov | https://www.usaspending.gov | Federal spending data — contracts, grants, loans, and direct payments | CSV, JSON, API |
Crime & Law Enforcement
| Source | URL | Description | Formats |
| FBI Crime Data Explorer (CDE) | https://cde.ucr.cjis.gov | NIBRS crime data, hate crime, law enforcement officers killed/assaulted, cargo theft | CSV, JSON, API |
| FBI Uniform Crime Reporting (UCR) | https://www.fbi.gov/how-we-can-help-you/more-fbi-services-and-information/ucr | Legacy crime reporting program (Summary Reporting System) | CSV, PDF |
| Bureau of Justice Statistics (BJS) | https://bjs.ojp.gov | Victimization surveys, corrections, courts, law enforcement statistics | CSV, PDF |
| National Crime Victimization Survey | https://bjs.ojp.gov/data-collection/ncvs | Household survey on criminal victimization including unreported crimes | CSV, SAS |
Federal Sentencing
| Source | URL | Description | Formats |
| U.S. Sentencing Commission (USSC) | https://www.ussc.gov/research | Federal sentencing data, guidelines analysis, annual reports | CSV, PDF |
| USSC Interactive Data Analyzer | https://ida.ussc.gov/analytics/saw.dll?Dashboard | Interactive tool for federal sentencing statistics | Web, Export |
| USSC Annual Reports & Sourcebooks | https://www.ussc.gov/research/annual-reports-and-sourcebooks | Comprehensive annual sentencing statistics | PDF, CSV |
Antitrust & Competition
| Source | URL | Description | Formats |
| DOJ Antitrust Case Filings | https://www.justice.gov/atr/antitrust-case-filings | Index of DOJ Antitrust Division civil and criminal cases | HTML, PDF |
| FTC/DOJ 2023 Merger Guidelines | https://www.justice.gov/atr/2023-merger-guidelines | Current merger review framework including HHI thresholds | PDF |
| HSR Filing Data (Data.gov) | https://catalog.data.gov | Hart-Scott-Rodino premerger notification filing statistics | CSV |
| FTC Competition Cases | https://www.ftc.gov/legal-library/browse/cases-proceedings | FTC enforcement actions, consent decrees, and merger challenges | HTML, PDF |
Drug Enforcement
| Source | URL | Description | Formats |
| DEA Data & Statistics | https://www.dea.gov/data-and-statistics | Drug seizure data, national drug threat assessments, ARCOS | PDF, CSV |
| DEA National Drug Threat Assessment | https://www.dea.gov/documents/2024/05/dea-2024-national-drug-threat-assessment | Annual assessment of drug trafficking trends and threats | PDF |
| ONDCP Drug Policy Data | https://www.whitehouse.gov/ondcp/ | White House Office of National Drug Control Policy datasets | PDF, CSV |
Incarceration & Criminal Justice
| Source | URL | Description | Formats |
| Bureau of Prisons (BOP) Statistics | https://www.bop.gov/about/statistics/ | Federal inmate population, demographics, offense types, facility data | HTML, PDF |
| Vera Institute Incarceration Trends | https://github.com/vera-institute/incarceration-trends | County-level jail and prison incarceration data (1970–present) | CSV (hosted on GitHub) |
| Prison Policy Initiative | https://www.prisonpolicy.org/data/ | Research and data on mass incarceration in the U.S. | HTML, PDF |
| The Sentencing Project | https://www.sentencingproject.org/research/ | State and federal sentencing policy research and data | PDF, HTML |
Academic & Research Resources
| Source | URL | Description | Formats |
| ICPSR NACJD | https://www.icpsr.umich.edu/web/pages/NACJD/index.html | National Archive of Criminal Justice Data — largest criminal justice data archive | SAS, SPSS, CSV |
| ICPSR General | https://www.icpsr.umich.edu | Inter-university Consortium for Political and Social Research | Multiple |
| National Institute of Justice (NIJ) | https://nij.ojp.gov/funding/data-resources-program | DOJ research arm — grants data and criminal justice research datasets | CSV, PDF |
| Urban Institute Justice Policy Center | https://www.urban.org/policy-centers/justice-policy-center | Research on corrections, policing, courts, and reentry | PDF, CSV |
Microsoft Published White Papers & Resources
Official Microsoft-published white papers, reference architectures, and technical guides relevant to analytics, security, compliance, and fraud detection on Azure and Microsoft Fabric.
Microsoft Fabric & Power BI
| Resource | URL | Description |
| Microsoft Fabric Security White Paper | https://learn.microsoft.com/en-us/fabric/security/white-paper-landing-page | End-to-end security overview for Fabric — data protection, governance, network security |
| Power BI White Papers Collection | https://learn.microsoft.com/en-us/power-bi/guidance/whitepapers | Official index of all Power BI white papers (security, dataflows, deployment, GDPR) |
| Power BI and Dataflows White Paper | https://go.microsoft.com/fwlink/?linkid=2034388&clcid=0x409 | Technical deep-dive on dataflow features and capabilities |
| Advanced Analytics with Power BI | https://info.microsoft.com/advanced-analytics-with-power-bi.html?Is=Website | Predictive analytics, custom visuals, R integration, DAX |
| Medallion Lakehouse Architecture in Fabric | https://learn.microsoft.com/en-us/fabric/onelake/onelake-medallion-lakehouse-architecture | Official guidance on Bronze/Silver/Gold pattern in Microsoft Fabric |
| Resource | URL | Description |
| Cloud-Scale Analytics (Cloud Adoption Framework) | https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/cloud-scale-analytics/ | Enterprise data platform design patterns, data mesh, governance |
| Azure Synapse Analytics Security White Paper | https://learn.microsoft.com/en-us/azure/synapse-analytics/guidance/security-white-paper-introduction | Comprehensive security architecture for analytics workloads |
| Azure Data Factory Documentation & White Papers | https://learn.microsoft.com/en-us/azure/data-factory/ | ETL/ELT pipeline patterns, data integration best practices |
Fraud Detection & Compliance
| Resource | URL | Description |
| Real-Time Fraud Detection with Stream Analytics | https://learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-real-time-fraud-detection | Step-by-step tutorial: Event Hubs + Stream Analytics + Power BI for fraud detection |
| Azure Real-Time Fraud Detection Reference Architecture | https://github.com/microsoft/azure-real-time-fraud-detection | GitHub reference implementation for real-time fraud analytics on Azure |
Government & Public Safety
| Resource | URL | Description |
| Azure for Government — Justice & Public Safety | https://learn.microsoft.com/en-us/azure/azure-government/documentation-government-overview-jps | Azure Government capabilities for justice, public safety, and law enforcement |
| The Future of Public Safety and Justice (e-book) | https://info.microsoft.com/ww-landing-the-future-of-public-safety-and-justice.html | Microsoft e-book on digital transformation in public safety and justice |
- CSV/JSON: Directly ingestible via PySpark `spark.read.csv()` / `spark.read.json()` in Fabric notebooks
- API: Use `requests` library in Bronze notebooks; paginate and write to Delta tables
- PDF: Requires extraction (tabula-py or manual) before ingestion; suitable for reference documentation rather than automated pipelines
- SAS/SPSS: Convert via `pandas.read_sas()` or `pyreadstat` before loading to lakehouse
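
The API note above mentions pagination without showing it. The following is a minimal sketch of one way to walk a paged endpoint, not a definitive implementation. It assumes the API returns its records under a `results` key and signals the end with an empty page; the helper names (`paginate`, `make_http_fetcher`) and the `page` / `per_page` query parameters are hypothetical and will differ per source.

```python
from typing import Any, Callable, Dict, Iterator

Page = Dict[str, Any]


def paginate(fetch_page: Callable[[int], Page], max_pages: int = 1000) -> Iterator[Dict[str, Any]]:
    """Yield records one by one, requesting pages until one comes back empty.

    `fetch_page` takes a 1-based page number and returns the decoded JSON payload.
    `max_pages` is a safety cap so a misbehaving API cannot loop forever.
    """
    for page_number in range(1, max_pages + 1):
        payload = fetch_page(page_number)
        # Assumption: records live under "results"; adjust for the actual API.
        records = payload.get("results", [])
        if not records:
            return
        yield from records


def make_http_fetcher(url: str, page_size: int = 100) -> Callable[[int], Page]:
    """Build a `fetch_page` callable backed by the requests library.

    The `page` and `per_page` parameter names are assumptions for illustration.
    """
    import requests  # imported lazily so `paginate` can be used without it

    def fetch_page(page_number: int) -> Page:
        response = requests.get(
            url,
            params={"page": page_number, "per_page": page_size},
            timeout=30,
        )
        response.raise_for_status()
        return response.json()

    return fetch_page
```

In a Bronze notebook, the collected records could then be passed to `spark.createDataFrame(list(paginate(make_http_fetcher(url))))` and appended to a Delta table. Separating the paging loop from the HTTP call keeps the loop easy to exercise with a stub instead of a live endpoint.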
Usage in Fabric POC
All data sources listed here can be integrated into the POC using the established patterns:
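# Example: Ingest from a federal API (api.example.gov is a placeholder endpoint)
import requests
from pyspark.sql import SparkSession
# Fabric notebooks already provide a session; getOrCreate() reuses it
spark = SparkSession.builder.getOrCreate()
response = requests.get("https://api.example.gov/data", params={"format": "json"}, timeout=30)
response.raise_for_status()  # stop on HTTP errors rather than parsing an error body
data = response.json()
# Assumes the payload holds its records under a "results" key
df = spark.createDataFrame(data["results"])
df.write.format("delta").mode("append").saveAsTable("lh_bronze.source_raw")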
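
API payloads often have records with differing keys or nested values, which can trip up schema inference when building a DataFrame. The sketch below shows one possible way to flatten records into uniform string-valued rows before the Bronze write. The function name `to_bronze_rows` and the `_source_url` / `_ingested_at` column names are hypothetical, chosen for illustration only.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, List, Optional


def to_bronze_rows(
    records: Iterable[Dict[str, Any]],
    source_url: str,
    ingested_at: Optional[str] = None,
) -> List[Dict[str, Optional[str]]]:
    """Return rows that share one set of columns and hold only strings or None.

    Non-string values (numbers, booleans, lists, dicts) are JSON-encoded so the
    raw content is preserved as text; keys missing from a record become None.
    """
    stamp = ingested_at or datetime.now(timezone.utc).isoformat()
    records = list(records)
    # Union of all keys, sorted for a stable column order.
    columns = sorted({key for record in records for key in record})

    rows = []
    for record in records:
        row: Dict[str, Optional[str]] = {}
        for column in columns:
            value = record.get(column)
            if value is None or isinstance(value, str):
                row[column] = value
            else:
                row[column] = json.dumps(value, sort_keys=True)
        # Hypothetical lineage columns; rename to match the project's conventions.
        row["_source_url"] = source_url
        row["_ingested_at"] = stamp
        rows.append(row)
    return rows
```

The result can be handed to `spark.createDataFrame(...)`. Passing an explicit all-string schema there avoids inference problems when a column is entirely null, and type casting can then be deferred to a later layer.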
For detailed implementation patterns, see the individual use case documents.
Last Updated: 2026-04-23