
Data Preprocessing

Original Data

Data Cleaning

Defining Dropout

Cohort Selection

Code

The code that preprocesses the data as described in this section can be checked out as a Maven project from the Subversion repository at svn://sun.calstatela.edu/irp/tools/trunk. To use the code, follow these steps:

  • Export the original data to text files. Use | as the delimiter and no text qualifier.
  • Run irp-create.sql to create the tables in a PostgreSQL database.
  • Run Importer to import the data from the text files to the database tables.
  • Run irp-data.sql to process the data. This step may take a while (about 18 minutes on my computer).
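
To illustrate why the export format matters, here is a minimal Java sketch of how a line in a |-delimited file with no text qualifier could be split into fields. It is not the project's Importer: the class name PipeLineParser, the method parseLine, and the sample line are hypothetical and used only for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the irp tools code base.
public class PipeLineParser {

    // Splits one exported line on '|'. With no text qualifier there is no
    // quoting, so every '|' is treated as a field separator.
    public static List<String> parseLine(String line) {
        // The -1 limit keeps trailing empty fields, so a line ending in '|'
        // still yields an empty last column instead of dropping it.
        return Arrays.asList(line.split("\\|", -1));
    }

    public static void main(String[] args) {
        // Made-up sample line; the real column layout is defined in irp-create.sql.
        List<String> fields = parseLine("1001|Fall 2005|CS|");
        System.out.println(fields.size() + " fields: " + fields);
    }
}
```

Because there is no text qualifier, a | inside a data value would be read as a field boundary, so the exported values themselves should not contain that character.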

For more details, see the comments in the code.
