Document Tag Parser & Box.com/eDefender Integration
Sponsored by Santa Barbara Public Defender Office
Joshua Cabrera, Marco De La Torre, Raul Gallegos, Daniel Guevara-Dominguez, Chuang Huang, Dang Le, Jesica Lopez De Leon, Shaocheng Shi, Sergio Tapia, Luke Williams
Advisor: Jungsoo (Soo) Lim
Liaison: Deepak Budwani

Document Tag Parser

Background

The attorneys and paralegals at the Santa Barbara Public Defender review a large volume of documents daily. To keep evidence organized, they rename each document according to the sequences of numbers stamped on its first and last pages, known as Bates stamps. Staff must open and check each document manually. Cases often have twenty or more documents, so the process becomes time-consuming at scale. Our objective was to automate it so that large batches of documents can be renamed at once, in a fraction of the time the manual process requires.

Description

Our solution is a desktop Python application. It uses pytesseract, a Python wrapper for the Tesseract optical character recognition (OCR) engine, to convert image documents to text, then reads the Bates stamps and renames each document automatically. A graphical user interface lets the user choose the folder to process and the folder where the renamed files are written. The application is multi-threaded to reduce overall processing time. Please refer to our poster and documentation for sample images of the application.
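The renaming step can be illustrated with a minimal sketch. This is not the application's source code: the Bates stamp pattern (an uppercase prefix followed by digits), the "first-last" naming scheme, and the helper names find_bates_stamp, build_new_name, propose_name, and propose_names are all assumptions made for illustration. Only pytesseract.image_to_string is an actual library call, and it is imported lazily so the parsing helpers can be tried without Tesseract installed.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Assumed stamp format for illustration: uppercase prefix plus digits, e.g. "SBPD000123".
BATES_PATTERN = re.compile(r"\b[A-Z]{2,}[-_]?\d{4,}\b")


def find_bates_stamp(page_text):
    """Return the last Bates-like token found in the OCR text, or None."""
    matches = BATES_PATTERN.findall(page_text)
    return matches[-1] if matches else None


def build_new_name(first_page_text, last_page_text, suffix=".pdf"):
    """Build a 'FIRST-LAST.pdf' name (hypothetical scheme) from two pages of text."""
    first = find_bates_stamp(first_page_text)
    last = find_bates_stamp(last_page_text)
    if first is None or last is None:
        return None  # leave the document for manual review
    return f"{first}-{last}{suffix}"


def ocr_page(image_path):
    """Run OCR on one page image; requires pytesseract, Pillow, and Tesseract."""
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(image_path))


def propose_name(first_image, last_image, ocr=ocr_page):
    """OCR the first and last page images and derive the new file name."""
    return build_new_name(ocr(first_image), ocr(last_image))


def propose_names(documents, ocr=ocr_page, max_workers=4):
    """Process documents in a thread pool.

    documents maps a document name to (first_page_image, last_page_image).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            doc: pool.submit(propose_name, first, last, ocr)
            for doc, (first, last) in documents.items()
        }
        return {doc: future.result() for doc, future in futures.items()}
```

Passing the OCR function as a parameter keeps the stamp parsing separate from Tesseract, so the naming logic can be checked with plain strings before it is run against scanned pages.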

 

Box.com and eDefender Integration

Background

Our sponsor, Santa Barbara Public Defender, has transitioned to a fully paperless case management system. To support this new project, they are using Box.com as their cloud storage provider. The transition requires moving over 50 terabytes of data to the cloud, and their data needs are increasing exponentially each year. As part of this process, our team's objective was to assist Santa Barbara Public Defender with integrating the Box cloud provider with their case management system, eDefender. Our role was to provide a way for transcriptions of case discovery media (video and audio) to be created and then uploaded to the cloud, allowing attorneys and paralegals to review evidence more efficiently. Due to the complexity of this transition, contracts with Box and Azure are still pending and must be finalized before this project can be completed. After discussing the options with our sponsor, we agreed to put this part of the project on hold to assist with a separate task, the Document Tag Parser described above.

Description

The design is based on an application originally created by another CSULA team in 2020. A Box Skill application is a program that runs every time a file is uploaded to the Box cloud service. This Box Skill is written in Node.js and deployed as a serverless instance in AWS. It uses Azure Video Analyzer to transcribe audio as well as video. The video analyzer can also use facial recognition to provide timestamps marking the individuals who appear throughout the video. Every time a video or audio file is uploaded to Box, the application is triggered to handle the processing, and the transcript is returned to Box and integrated into its existing user interface. Finally, an alert system will be added to the Box Skill application to notify attorneys of new evidence and its corresponding transcriptions.
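The upload-triggered flow can be sketched as follows. The Box Skill itself is written in Node.js; this sketch uses Python only to match the Document Tag Parser, and it is an illustration under stated assumptions rather than the skill's code. The extension list, the (start_seconds, speaker, text) segment shape, and the names should_transcribe, format_timestamp, and handle_upload are hypothetical, and the transcribe argument merely stands in for the call to the transcription service.

```python
from pathlib import PurePosixPath

# Hypothetical set of media extensions that would trigger transcription.
MEDIA_EXTENSIONS = {".mp3", ".wav", ".m4a", ".mp4", ".mov", ".avi"}


def should_transcribe(file_name):
    """Only audio and video uploads are sent for transcription."""
    return PurePosixPath(file_name).suffix.lower() in MEDIA_EXTENSIONS


def format_timestamp(seconds):
    """Render an offset in seconds as HH:MM:SS (fractions are truncated)."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def handle_upload(file_name, transcribe):
    """Return timestamped transcript lines for a media upload, else None.

    transcribe is a stand-in for the transcription service; it is assumed
    to return a list of (start_seconds, speaker, text) tuples.
    """
    if not should_transcribe(file_name):
        return None
    return [
        f"[{format_timestamp(start)}] {speaker}: {text}"
        for start, speaker, text in transcribe(file_name)
    ]
```

In the real deployment the returned lines would be written back to Box as the file's transcript; here they are simply returned so the flow can be followed end to end.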

 

 

 

Role | Name | Email
Faculty Advisor | Jungsoo Lim | jlim34@calstatela.edu
Project Lead | Daniel Guevara-Dominguez | dgueva20@calstatela.edu
Document Lead | Jesica Lopez De Leon | jlopezd3@calstatela.edu
Customer Liaison/Requirements Lead | Luke Williams | lwillia@calstatela.edu
Architecture/Design Lead | Sergio Tapia | Stapia11@calstatela.edu
UI Lead | Shaocheng Shi | sshi5@calstatela.edu
Backend Lead | Chuang Huang | chuang11@calstatela.edu
Database Schema Lead | Marco De La Torre | mdelat23@calstatela.edu
QA/QC Lead | Joshua Cabrera | jcabre83@calstatela.edu
Demo Lead | Raul Gallegos | rgalle17@calstatela.edu
Presentation Lead | Dang Le | dle18@calstatela.edu

Meeting Schedule:

  1. Weekly meeting with the advisor: Thursday 6:00 PM
  2. Weekly team meeting: Friday 10:00 AM
  3. Meeting with liaison: Friday 9:00 AM

Link to Presentation Recording: https://cosantabarbara.box.com/s/i4fwfgu5x30pcm9by0gauiuse8e7jzz7

 

Resources
REST API
PD Discovery Overview
Fall 2021 Presentation
Box and Box Skill Set up - Updated User Manual
Software Requirements Specification (SRS) Version 1.1
Software Design Document (SDD) Version 1.1
Project Poster
Final Presentation Slides - DTP & Box/eDefender Integration
Expo Presentation Recording
[Private Resource] Document Tag Parser Source Code
Document Tag Parser Installer
SDD V2 Spring 2022 - DTP & Box/eDefender
SRS V2 Spring 2022 - DTP & Box/eDefender
Project Report: Box.com/eDefender Integration Document Tag Parser