Forensic documentation of epistemic exclusion across 400+ years of American institutional history — and its continuation into artificial intelligence.
Sphinx Analysis is the research and forensic documentation arm of SPHINX Global Enterprises Corp. We build peer-review-ready evidence chains tracing the structural mechanisms of exclusion — from Papal Bulls to LLM training data — grounded in primary sources, court records, constitutional text, and documented AI governance failures.
Sphinx Analysis operates four interconnected research programs, each producing documentary evidence in formats accessible to academic journals, legal proceedings, policy institutions, and grant applications. All research is conducted through the SPHINX Method™ multi-LLM orchestration protocol to neutralize single-source bias.
The continuity thesis is not an argument — it is a documented chain of institutional decisions, each building on the legal and structural architecture of the last. Every node in this chain is traceable to primary source documentation, not secondary analysis.
In 1884–85, representatives of fourteen nations convened in Berlin to divide the African continent into spheres of colonial influence. Not a single African representative was present at the table where the map of their world was redrawn. The conference produced a legal and structural architecture for extracting value from African land, labor, and knowledge — while maintaining the fiction that this arrangement served the interests of "civilization."
That template is executing again — in the design, training, and deployment of large language models.
The training data that forms the cognitive foundation of GPT-4, Gemini, Claude, and their successors is derived primarily from Common Crawl — a corpus that reflects the demographic composition of web content creators: Western, English-language, and educated populations. The communities most affected by AI deployment in education, criminal justice, hiring, and healthcare contributed the least to the epistemic architecture they will be governed by.
Fewer than 4% of AI researchers globally are Black. The annotation workforce — the human beings who train AI systems to recognize what is appropriate, harmful, and correct — is composed largely of workers in Kenya, the Philippines, and India; TIME's investigation of OpenAI's content-moderation contractors in Kenya documented wages between $1.32 and $2.00 per hour. These workers are the cognitive laborers of the AI economy. They receive no equity stake, no governance voice, and no representation in the systems their labor trains.
Carter G. Woodson identified this mechanism in 1933 in The Mis-Education of the Negro. He documented how primary and secondary education shaped the self-concept of Black children before critical-thinking tools could develop — centering European epistemology while marginalizing the African and African American intellectual tradition. The mechanism he described is not corrected in LLMs. It is automated and scaled.
Digital Apartheid is not a metaphor. It is a structural description of documented conditions: governance without representation, value extraction without equity, epistemic architecture built without the participation of those it governs most. The Berlin Conference produced the physical boundaries that divided Africa. LLM training and deployment are producing the epistemic boundaries that will divide the digital world — with the same absence of the people most affected from the table where those boundaries are drawn.
This research program documents those conditions forensically — through AI model cards, training data composition studies, governance board demographics, annotation labor investigations, and the comparison of LLM outputs against verifiable historical record — and builds the evidentiary foundation for academic, legal, and policy response.
Research documents are available to academic institutions, policy organizations, legal practitioners, journalists, and grant program reviewers. All access requests are reviewed and responded to within 5 business days.