alandaux/Text-Field-Data-Cleaning

Text-Field-Data-Cleaning

Data Cleaning for Text Fields in Aggregate Analyses Using dplyr

Code developed by Arielle Landau

This R script is designed to clean text fields in large datasets. The code focuses on shorter text fields, such as names and titles, that contain misspellings, inconsistent capitalization, and other common mistakes that occur when humans enter data. After cleaning, formerly messy text fields can be matched and counted in aggregate analyses.
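As a rough illustration of the idea, the sketch below normalizes a short text field and then counts the cleaned values with dplyr. It is not the repository's actual script: the sample data, the column name `company`, the helper `clean_name`, and the list of corporate suffixes are all assumptions made up for this example.

```r
library(dplyr)

# Hypothetical sample data; the column name and values are invented
# for illustration and do not come from the original script.
raw <- data.frame(
  company = c("Shell", "shell ", "SHELL Inc.", "Chevron", "chevron corp"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: lowercases, strips punctuation, drops a few common
# corporate suffixes, and collapses whitespace so variants share one key.
clean_name <- function(x) {
  x <- tolower(x)
  x <- gsub("[[:punct:]]", "", x, perl = TRUE)
  x <- gsub("\\b(inc|corp|ltd|llc)\\b", "", x, perl = TRUE)
  x <- gsub("\\s+", " ", x, perl = TRUE)
  trimws(x)
}

# Add a cleaned column, then count how often each cleaned name appears.
counts <- raw %>%
  mutate(company_clean = clean_name(company)) %>%
  count(company_clean, sort = TRUE)

print(counts)
```

This sketch only handles capitalization, punctuation, spacing, and suffix variants. True misspellings would need an additional step, such as a manual lookup table or approximate string matching, which is not shown here.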

Use Cases:

  • Counting the number of times different corporations are involved in environmental conflicts, using user-entered data from the Environmental Justice Atlas

Potential Use Cases:

  • Keeping track of donor involvement in non-profits despite name misspellings
  • Standardizing answers to surveys with text fields
