This R script cleans text fields in large datasets. It targets short text fields, such as names and titles, that contain misspellings, inconsistent capitalization, and other common mistakes that occur when humans enter data. Once cleaned, formerly messy text fields can be matched and counted in aggregate analyses.
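As a rough illustration of this kind of cleaning, the sketch below normalizes case, punctuation, and whitespace, then maps near-misses onto a known spelling by edit distance. It is not taken from the script itself: the sample data, the `normalize_text` helper, the `canonical` list, and the `max_edits` threshold are all assumptions made for the example, and only base R functions are used.

```r
# Hypothetical sample data: one company entered four different ways.
raw_names <- c("Shell Oil Co.", "shell oil co", "  SHELL  OIL CO ",
               "Shel Oil Co.", "Chevron")

# Hypothetical helper: lowercase, strip punctuation, collapse and trim spaces.
normalize_text <- function(x) {
  x <- tolower(x)
  x <- gsub("[[:punct:]]", "", x)
  x <- gsub("[[:space:]]+", " ", x)
  trimws(x)
}

cleaned <- normalize_text(raw_names)

# Assumed list of known-good spellings to match against.
canonical <- c("shell oil co", "chevron")

# Edit distance between every cleaned value and every canonical name.
distances <- utils::adist(cleaned, canonical)
closest <- apply(distances, 1, which.min)
best_distance <- distances[cbind(seq_along(cleaned), closest)]

# Assumed threshold: accept a canonical name only if it is within 2 edits.
max_edits <- 2
matched <- ifelse(best_distance <= max_edits, canonical[closest], cleaned)

# Aggregate counts now treat all spellings of a name as one entry.
counts <- table(matched)
print(counts)
```

A fixed edit-distance threshold is the simplest choice here. Very short names may need a stricter limit to avoid false matches.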
Use Cases:
- Counting how many times different corporations are involved in environmental conflicts, using user-entered data from the Environmental Justice Atlas
Potential Use Cases:
- Keeping track of donor involvement in non-profits despite name misspellings
- Standardizing answers to surveys with text fields