USING GENERATIVE AI FOR CLASSIFICATION OF LEGAL DOCUMENTS
DOI: https://doi.org/10.66104/4zsfgz56
Keywords: Judicial Efficiency, Legal Technology, Artificial Intelligence in Law, Procedural Automation, Digital Justice
Abstract
The Brazilian judicial system faces an enormous backlog of digital lawsuits, which makes manual case sorting both costly and unreliable. This research explores the use of Generative Artificial Intelligence, specifically Large Language Models (LLMs), to streamline the categorization of legal petitions. The study proceeds in three phases. First, a few-shot learning approach was tested and reached a modest accuracy of 56%. Second, the methodology was improved through prompt engineering combined with N-gram analysis and data augmentation strategies to address skewed (imbalanced) datasets. Finally, a Retrieval-Augmented Generation (RAG) framework was implemented to optimize performance. Using real-world data from the Court of Justice of Tocantins, the experiments showed that the RAG-based system achieved 84% accuracy across 11 complex legal categories. This architecture minimized the occurrence of AI hallucinations and helped resolve semantic ambiguities often found in legal texts. The findings suggest that the approach provides a reliable and scalable framework for the LegalTech industry and a viable path toward modernizing judicial administration. By automating the initial stages of case management, the proposed solution improves operational efficiency and makes the processing of legal documents more consistent, contributing to a more agile and responsive justice system in Brazil and potentially in other jurisdictions facing similar digital challenges.
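The retrieval step behind a RAG-style classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the example petitions, the category names, the word n-gram cosine similarity, and the helper names (`ngrams`, `cosine`, `retrieve`, `build_prompt`) are all assumptions made for illustration. The sketch retrieves the labeled petitions most similar to a new one and assembles them into a few-shot prompt. The call to an LLM is deliberately left out.

```python
import math
from collections import Counter

# Hypothetical labeled petitions; the real study used court data and 11 categories.
LABELED = [
    ("request for payment of overdue child support", "Family"),
    ("appeal against tax assessment by the state treasury", "Tax"),
    ("claim for damages after a traffic accident", "Civil liability"),
]

def ngrams(text, n=2):
    """Count unigrams plus word n-grams of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    grams = list(tokens)
    grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return Counter(grams)

def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, corpus, k=2):
    """Return the k labeled examples most similar to the query."""
    q = ngrams(query)
    ranked = sorted(corpus, key=lambda item: cosine(q, ngrams(item[0])), reverse=True)
    return ranked[:k]

def build_prompt(query, corpus, k=2):
    """Assemble a few-shot prompt grounded in the retrieved examples."""
    labels = sorted({label for _, label in corpus})
    parts = ["Classify the petition into one of: " + ", ".join(labels) + "."]
    for text, label in retrieve(query, corpus, k):
        parts.append(f"Petition: {text}\nCategory: {label}")
    parts.append(f"Petition: {query}\nCategory:")
    return "\n\n".join(parts)

if __name__ == "__main__":
    print(build_prompt("petition for child support payment", LABELED))
```

Grounding the prompt in retrieved, already-labeled petitions is one plausible way such a system could constrain the model to known categories; a production setup would more likely use dense embeddings and a vector store in place of the n-gram similarity assumed here.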
License
Copyright (c) 2026 Ruan Dias Santana, Gabriel Reis Nadler Prata, Marcelo da Silva Lisboa, Silvanete Maria da Silva, Marcelo Lisboa Rocha

This work is licensed under a Creative Commons Attribution 4.0 International License.
