Legal artificial intelligence under empirical and epistemic scrutiny

Authors

Muñoz, O. A.

DOI:

https://doi.org/10.5281/zenodo.15959496

Keywords:

artificial intelligence, law, legal hallucinations, epistemic injustice, automated verification

Abstract

This study critically examined the phenomenon of legal hallucinations generated by artificial intelligence systems used in legal contexts. The objective was twofold: to quantify the frequency and types of errors produced by general-purpose and specialized models, and to analyze the ethical and epistemic implications of these failures. A quasi-experimental comparative design was adopted, using a corpus of 200 legal scenarios structured according to the IRAC method. Four artificial intelligence systems were evaluated: two general-purpose language models (ChatGPT 4 and Llama 2) and two specialized legal tools with retrieval augmentation (Lexis+ AI and Westlaw AI). Data collection combined manual coding by legal experts with automated analysis based on semantic entropy and semantic entropy probes. The results revealed that the general-purpose models exhibited significantly higher hallucination rates, with fabricated legal citations being the most frequent error. The automated detection system achieved acceptable accuracy in identifying inconsistencies, with performance metrics closely matching those of the human coders. These failures represent not only a technical risk but also an emerging form of epistemic injustice, as they compromise access to verified information and undermine trust in legal knowledge. It was concluded that epistemic validation mechanisms must be incorporated into legal artificial intelligence systems and that regulatory frameworks should be developed to ensure the responsible use of these technologies in forensic and academic practice.
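
For context, the semantic-entropy approach mentioned in the abstract (Farquhar et al., 2024; Kossen et al., 2024) scores an answer by sampling several responses to the same prompt, clustering them by meaning, and measuring the entropy of the resulting clusters; high entropy signals a likely confabulation. The Python sketch below is a minimal illustration of that idea, not the study's actual pipeline: the clustering routine, the toy equivalence check, and the sample answers are assumptions introduced here for clarity.

import math

def semantic_entropy(answers, are_equivalent):
    """Entropy over meaning clusters of sampled answers (illustrative sketch).

    answers: list of responses sampled from a model for one legal question.
    are_equivalent: callable(a, b) -> bool deciding whether two answers say
        the same thing; Farquhar et al. (2024) use bidirectional entailment.
    """
    # Greedily group answers into meaning-equivalence clusters.
    clusters = []
    for ans in answers:
        for cluster in clusters:
            if are_equivalent(cluster[0], ans):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])
    # Entropy of the empirical distribution over clusters: many mutually
    # inconsistent meanings with similar frequency -> high entropy.
    probs = [len(c) / len(answers) for c in clusters]
    return -sum(p * math.log(p) for p in probs)

# Hypothetical example: five answers sampled for the same prompt.
samples = [
    "The limitation period is five years.",
    "Five years.",
    "It is ten years.",
    "The period is five years.",
    "Two years from notification.",
]

def same_years(a, b):
    # Toy equivalence check on the number of years mentioned; a real
    # pipeline would use a natural-language-inference model instead.
    mentioned = lambda s: {w for w in ("two", "five", "ten") if w in s.lower()}
    return mentioned(a) == mentioned(b)

print(round(semantic_entropy(samples, same_years), 3))  # 0.95: the answers disagree

Semantic entropy probes (Kossen et al., 2024) approximate this signal from the model's hidden states without repeated sampling, which is what makes automated screening of large answer sets tractable.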

References

Bench-Capon, T., Prakken, H., & Sartor, G. (2022). Artificial intelligence and legal reasoning: Past, present and future. Artificial Intelligence, 303, 103644. https://doi.org/10.1016/j.artint.2021.103644

Dahl, M., Magesh, V., Suzgun, M., & Ho, D. E. (2024). Large legal fictions: Profiling legal hallucinations in large language models. Journal of Legal Analysis, 16(1), 64–93. https://doi.org/10.1093/jla/laae003

Farquhar, S., Kossen, J., Kuhn, L., & Gal, Y. (2024). Detecting hallucinations in large language models using semantic entropy. Nature, 630(8017), 625–630. https://doi.org/10.1038/s41586-024-07421-0

Fricker, M. (2007). Epistemic injustice: Power and the ethics of knowing. Oxford University Press.

Kay, J., Kasirzadeh, A., & Mohamed, S. (2024). Epistemic injustice in generative AI. arXiv. https://doi.org/10.48550/arXiv.2408.11441

Kossen, J., Han, J., Razzak, M., Schut, L., Malik, S., & Gal, Y. (2024). Semantic entropy probes: Robust and cheap hallucination detection in LLMs. arXiv. https://doi.org/10.48550/arXiv.2406.15927

Langton, R. (2010). Epistemic injustice: Power and the ethics of knowing. https://www.jstor.org/stable/40602716

Latif, Y. A. (2025). Hallucinations in large language models and their influence on legal reasoning: Examining the risks of AI-generated factual inaccuracies in judicial processes. Journal of Computational Intelligence, Machine Reasoning, and Decision-Making, 10(2), 10–20. https://morphpublishing.com/index.php/JCIMRD/article/view/2025-02-07

Magesh, V., Surani, F., Dahl, M., Suzgun, M., Manning, C. D., & Ho, D. E. (2024). Hallucination-free? Assessing the reliability of leading AI legal research tools. arXiv. https://doi.org/10.48550/arXiv.2405.20362

Mollema, W. J. T. (2024). A taxonomy of epistemic injustice in the context of AI and the case for generative hermeneutical erasure. PhilPapers. http://philpapers.org/archive/MOLATO-5

Surden, H. (2018). Artificial intelligence and law: An overview. Georgia State University Law Review, 35, 1305. https://heinonline.org/HOL/LandingPage?handle=hein.journals/gslr35&div=59&id=&page=

Taimur, A. (2025). Manipulative phantoms in the machine: A legal examination of large language model hallucinations on human opinion formation. In IFIP International Summer School on Privacy and Identity Management (pp. 59–77). Springer. https://doi.org/10.1007/978-3-031-91054-8_3

UNESCO. (2021). Recommendation on the ethics of artificial intelligence. https://unesdoc.unesco.org/ark:/48223/pf0000381137

Published

2025-07-31

Data Availability Statement

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Issue

Vol. 3 No. 2 (2025)

Section

Original articles

How to Cite

Muñoz, O. A. (2025). Legal artificial intelligence under empirical and epistemic scrutiny. Journal of Law and Epistemic Studies, 3(2), 13–18. https://doi.org/10.5281/zenodo.15959496
