articles - Stringfest Analytics https://stringfestanalytics.com Analytics & AI for Modern Excel Sun, 14 Aug 2022 01:02:05 +0000 en-US hourly 1 https://wordpress.org/?v=6.9.4 https://i0.wp.com/stringfestanalytics.com/wp-content/uploads/2020/05/cropped-RGB-SEAL-LOGO-STRINGFEST-01.png?fit=32%2C32&ssl=1 articles - Stringfest Analytics https://stringfestanalytics.com 32 32 98759290 Broman & Woo’s “Data Organization in Spreadsheets” https://stringfestanalytics.com/broman-woo-data-organization-in-spreadsheets/ Sun, 14 Aug 2022 01:02:04 +0000 https://stringfestanalytics.com/?p=9693 Earlier this week I posted a new article in the journal Nature titled “Six tips for better spreadsheets.” Today I’d like to share a 2018 article in The American Statistician along the same (grid)lines… “Data Organization in Spreadsheets.” You’ll find repeated themes here, such as the principles of consistency, keeping a raw data source, and […]

The post Broman & Woo’s “Data Organization in Spreadsheets” first appeared on Stringfest Analytics.

]]>
Earlier this week I posted a new article in the journal Nature titled “Six tips for better spreadsheets.”

Today I’d like to share a 2018 article in The American Statistician along the same (grid)lines… “Data Organization in Spreadsheets.” You’ll find repeated themes here, such as the principles of consistency, keeping a raw data source, and so forth.

This article goes into a bit more depth with some important topics like dates and missing values. Each concept:

  • Be consistent
  • Choose good names for things
  • Write dates as YYYY-MM-DD
  • No empty cells
  • Put just one thing in a cell
  • Make it a rectangle
  • Create a data dictionary
  • No calculations in the raw data filters
  • Do not use font color or highlighting as data
  • Make backups
  • Use data validation to avoid errors
  • Save the data in plain text files

Similar to the Nature article, I noticed that Power Query enforces many of these good habits such as keeping a raw data source and using a consistent value for nulls.

My personal favorite rule is to “make it a rectangle.” The article does a great job showing how data should be laid out in a spreadsheet for ease of storage and analysis. To the untrained eye, it looks repetitive and clunky! But the analyst knows a long and narrow rectangle beats a short and wide layered square.

What do you think of these principles and the article? Anything to add to this list?

The post Broman & Woo’s “Data Organization in Spreadsheets” first appeared on Stringfest Analytics.

]]>
9693