data - Stringfest Analytics
https://stringfestanalytics.com
Analytics & AI for Modern Excel

Python in Excel: How to create custom bins and frequency tables
https://stringfestanalytics.com/python-in-excel-how-to-create-custom-bins-and-frequency-tables/
Mon, 12 May 2025 16:27:09 +0000

The post Python in Excel: How to create custom bins and frequency tables first appeared on Stringfest Analytics.
If you’ve ever spent time manually grouping data into custom ranges or bins in Excel, you know it can get tedious fast. Whether it’s categorizing sales by price tiers, analyzing customer segments, or bucketing data for visualization, Excel often requires multiple nested formulas, helper columns, and PivotTables. Fortunately, Python offers a simpler, cleaner alternative.

In this post, I’ll show you how to effortlessly create custom frequency bins on the Windsor housing dataset using Python and Pandas, highlighting quick wins that’ll save you time and hassle compared to traditional Excel methods.

You can follow along with the exercise file below:

 

The first step will be to read our data source into Python in Excel as housing_df.

housing_df = xl("housing[#All]", headers=True)

# Create custom intervals
bins = [0, 40000, 75000, 125000, float('inf')]
labels = ['Budget (≤40k)', 'Affordable (40–75k)', 'Midrange (75–125k)', 'High-end (>125k)']

Next, we’ll define custom price ranges or “bins” using the bins list: bins = [0, 40000, 75000, 125000, float('inf')]. These values represent the price cutoff points for categorizing houses. Notice the float('inf') part? That simply means “infinity,” effectively capturing all prices greater than $125,000. Finally, the labels Budget, Affordable, Midrange, and High-end make your groups easy to read, just like creating labels in an Excel lookup table.

Our next steps will be to create a column in the DataFrame with these labeled values and count up the results:

housing_df['PriceBin'] = pd.cut(housing_df['price'], bins=bins, labels=labels)

# Get counts per bin
price_counts = housing_df['PriceBin'].value_counts().sort_index()
price_counts

The first line takes the price column from our dataset (housing_df) and groups each home into our previously defined price ranges using pd.cut(). Next, we use value_counts() to quickly count how many homes fall into each price category. Sorting by index with sort_index() ensures the counts are displayed in the order of the price categories, from lowest to highest.
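To see how pd.cut() treats prices that sit exactly on a cutoff, here is a minimal sketch on a handful of made-up prices. The sample_prices values are invented for illustration and are not taken from the Windsor dataset. Outside Python in Excel, pandas has to be imported explicitly.

```python
import pandas as pd

bins = [0, 40000, 75000, 125000, float('inf')]
labels = ['Budget (≤40k)', 'Affordable (40–75k)', 'Midrange (75–125k)', 'High-end (>125k)']

# Hypothetical prices chosen to sit on or near the cutoffs
sample_prices = pd.Series([25000, 40000, 40001, 125000, 190000])

# right=True is the default, so each bin includes its upper edge: (0, 40000], (40000, 75000], ...
sample_bins = pd.cut(sample_prices, bins=bins, labels=labels)
print(sample_bins.tolist())
```

A price of exactly 40,000 lands in the Budget bin and 40,001 moves to Affordable, which matches the "≤40k" wording in the labels. A price of exactly 0 would fall outside the first bin unless include_lowest=True is passed.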

The result, price_counts, neatly summarizes the number of homes within each pricing bin:

[Image: Custom price intervals]
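If percentages would be useful alongside the raw counts, one possible extension is sketched below. It assumes a small stand-in DataFrame called demo_df with invented prices; in the workbook the same steps would run against housing_df.

```python
import pandas as pd

bins = [0, 40000, 75000, 125000, float('inf')]
labels = ['Budget (≤40k)', 'Affordable (40–75k)', 'Midrange (75–125k)', 'High-end (>125k)']

# Hypothetical stand-in for housing_df (the real data comes from the xl() call)
demo_df = pd.DataFrame({'price': [38000, 42000, 60500, 66000, 82000, 95000, 130000, 175000]})
demo_df['PriceBin'] = pd.cut(demo_df['price'], bins=bins, labels=labels)

counts = demo_df['PriceBin'].value_counts().sort_index()
# normalize=True returns each bin's share of the total instead of a raw count
shares = demo_df['PriceBin'].value_counts(normalize=True).sort_index()

freq_table = pd.DataFrame({'count': counts, 'percent': (shares * 100).round(1)})
print(freq_table)
```

Keeping counts and percentages in one table plays the same role as adding a "% of Grand Total" field to a PivotTable.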

A nice step from here is to visualize the results. We’ll use price_counts.plot(kind='barh') to take the frequency counts of homes per price bin and create a horizontal bar chart. The subsequent lines (plt.title(), plt.xlabel(), and plt.ylabel()) add clear labels and a descriptive title, making the visualization easy to understand at a glance.

[Image: Price bin horizontal bar chart]
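The plotting steps described above can be sketched roughly as follows. The title and axis label text are assumptions, and the counts are invented stand-ins for price_counts. Python in Excel already provides plt, so the imports and the backend line are only needed when running this elsewhere.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for running outside Excel
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical counts standing in for price_counts
price_counts = pd.Series(
    [1, 3, 2, 2],
    index=['Budget (≤40k)', 'Affordable (40–75k)', 'Midrange (75–125k)', 'High-end (>125k)'],
)

# Horizontal bars: one bar per price bin, bar length equals the count
ax = price_counts.plot(kind='barh')
plt.title('Homes by price bin')   # assumed title text
plt.xlabel('Number of homes')     # assumed axis label
plt.ylabel('Price bin')           # assumed axis label
```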

To see another practical use of Pandas binning, let’s take our lotsize column and quickly split it into four quartiles using pd.qcut(). This function automatically divides the data into four groups with approximately equal numbers of observations, labeling them ‘Q1’ through ‘Q4’:

housing_df['LotsizeQuartile'] = pd.qcut(
    housing_df['lotsize'],
    q=4,
    labels=['Q1','Q2','Q3','Q4']
)

housing_df['LotsizeQuartile'].value_counts().sort_index()
[Image: Quartile count]

This code creates a new column LotsizeQuartile in our dataset, assigning each home’s lot size into one of the quartiles. The value_counts() method then counts how many homes fall into each quartile. While the quartiles aim to have an equal number of entries, slight differences can occur if multiple homes share identical lot sizes at the boundaries between quartiles, causing quartile bins to adjust slightly to accommodate ties.
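The effect of ties is easy to reproduce on a tiny example. The sketch below uses twelve invented lot sizes, three of which share the value 3000 that happens to fall on the first quartile boundary.

```python
import pandas as pd

# Hypothetical lot sizes with a repeated value (3000) sitting on a quartile edge
lots = pd.Series([1000, 2000, 3000, 3000, 3000, 4000,
                  5000, 6000, 7000, 8000, 9000, 10000])

quartiles = pd.qcut(lots, q=4, labels=['Q1', 'Q2', 'Q3', 'Q4'])
quartile_counts = quartiles.value_counts().sort_index()
print(quartile_counts.tolist())
```

With twelve values you might expect three per quartile, but all of the tied 3000s are assigned to Q1, so the counts come out uneven. If enough values are tied that two quartile edges coincide, pd.qcut() raises an error about non-unique bin edges instead.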

Finally, let’s compare how these two custom categories relate to each other by creating a two-way frequency table using pd.crosstab():

pd.crosstab(housing_df['PriceBin'], housing_df['LotsizeQuartile'])
[Image: Crosstab of lot size quartile and price bin]

This two-way frequency table instantly reveals how pricing categories align with property sizes. For instance, you can quickly spot trends like most affordable homes having smaller lots (Q1 and Q2), or higher-end homes primarily appearing in larger lot quartiles (Q3 and Q4). Generating this cross-tabulation in Python is straightforward and automatic: no manual drag-and-drop PivotTable or custom lookup tables required.
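Two optional arguments to pd.crosstab() cover what people often add to a PivotTable by hand: totals and row percentages. Here is a minimal sketch on invented data. demo_df and its values are hypothetical, and plain strings are used instead of the categorical columns built earlier, so rows sort alphabetically here.

```python
import pandas as pd

# Hypothetical stand-in for the two binned columns in housing_df
demo_df = pd.DataFrame({
    'PriceBin': ['Budget', 'Budget', 'Affordable', 'Affordable', 'High-end', 'High-end'],
    'LotsizeQuartile': ['Q1', 'Q2', 'Q1', 'Q3', 'Q3', 'Q4'],
})

# margins=True appends an "All" row and column holding the totals
with_totals = pd.crosstab(demo_df['PriceBin'], demo_df['LotsizeQuartile'], margins=True)

# normalize='index' converts each row into shares that sum to 1
row_shares = pd.crosstab(demo_df['PriceBin'], demo_df['LotsizeQuartile'], normalize='index')

print(with_totals)
print(row_shares.round(2))
```

Row shares make it easier to compare price bins that contain very different numbers of homes.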

If you enjoyed seeing how Python can simplify your Excel workflow and want more practical tips to boost your productivity, check out my mini-course “Python in Excel: Quick Wins” on Gumroad. It’s designed specifically for Excel users eager to leverage Python for immediate, real-world improvements, no programming experience needed. Get your copy now.

What questions do you have about creating custom frequency intervals in Python in Excel or about Python in Excel more generally? Let me know in the comments.

Ten ways to destroy your data analytics solopreneurship
https://stringfestanalytics.com/ten-ways-to-destroy-your-data-analytics-solopreneurship/
Sat, 02 Dec 2023 22:27:36 +0000

The post Ten ways to destroy your data analytics solopreneurship first appeared on Stringfest Analytics.
I have been running my own data analytics business for over five years. Throughout this journey, I’ve encountered numerous highs and lows. The risk of failure still looms, but that’s just part of the territory when you’re working for yourself in data analytics. If you’re venturing into this field and looking for ways to accelerate your downfall, here are ten straightforward ones.

It’s worth noting that many of these methods are likely applicable across various industries and in the context of starting other small businesses. However, my expertise lies in data analytics, so I’ll be framing my insights from this specific perspective.

So, let’s dive in, in no particular order. Keep in mind that some of these points may be interconnected, while others might even seem contradictory. Such is the complex and intertwined life of a solopreneur.

1. Taking excessively small contracts

Consider the complexities involved in acquiring a full-time job: enduring rigorous interviews, submitting numerous files and projects, providing certifications, completing tax filings, and setting up bank accounts. And this is just for one job!

When self-employed, these steps are replicated for multiple contracts, often with even greater overhead. As a solopreneur, you must handle additional tasks like creating invoices, managing your own benefits, and more. The administrative burden can make it disproportionately challenging to justify taking on a client for a small-scale project.

Given the high search and setup costs, it becomes nearly impractical to engage in work that yields only a few thousand dollars. This might seem surprising, but it’s a realistic assessment of the situation.

Therefore, my advice for those considering new clients is straightforward: don’t accept projects worth less than $5,000. If a potential client can’t offer that much work initially, consider proposing a retainer project or another legal arrangement to ensure the customer’s lifetime value reaches this threshold. The sheer volume of administrative work you undertake as a solopreneur warrants setting this minimum limit.

2. Not vetting potential clients enough

This scenario is closely related to dealing with clients who are unwilling to offer a substantial contract. You may encounter numerous clients who undervalue your services and offer minimal compensation, regardless of your expertise, publications, and accolades. Often, you might be approached by inexperienced individuals, perhaps fresh out of college, who are merely looking to fill a position.

These clients can consume a considerable amount of your time with preliminary discovery calls, giving you false hope as they try to organize their own affairs. They might lead you on, asking you to reserve dates and make commitments, all for a negligible reward.

It’s crucial to avoid falling into this trap. If you allow it, your business could suffer significantly. It may seem logical to entertain every potential client, but this isn’t practical. You must assess their budget early in the conversation. If a client is evasive about their budget, consider it a red flag. Further, if they continue to demand your time and effort without disclosing the budget, that’s an even more significant warning sign.

Your time and energy are valuable, and it’s essential to work with clients who recognize and are willing to pay for your worth.

3. Going above and beyond on scope

Working with a lawyer, or even hearing about their billing practices, can be quite enlightening. Lawyers are known for charging for every aspect of their service, including phone calls, emails, perhaps even the use of office supplies. While such thorough billing might initially seem excessive or even predatory, it becomes more understandable once you run your own business. This approach is not just beneficial; it’s often necessary.

As an employee, going above and beyond by performing tasks outside of your job description is common. The primary goal is to satisfy your boss and fulfill their requests. However, the dynamics change significantly when you’re self-employed. Your responsibility is to adhere strictly to the scope of the contract. Anything beyond that requires careful consideration.

For instance, if a client requests pre-briefing meetings that could be handled via email, or asks you to update slides for a given template or set up new webinar logins for a new system, these requests need to be evaluated against the contract’s scope. As a solopreneur managing multiple clients, engaging in such administrative tasks without compensation can consume a significant amount of your valuable time.

A useful strategy is to consider how an attorney would handle similar situations. How would they charge for these additional tasks? Emulating this approach ensures that you don’t engage in out-of-scope work, administrative or otherwise. Failing to do so can lead to pleasing clients at the expense of your own business’s sustainability.

4. Depending on one client alone

Falling for the allure of relying on a single, lucrative client or a former employer for business is a common pitfall for new solopreneurs. Initially, this arrangement might seem ideal with its attractive pay, flexible hours, and a sense of security. However, this situation is precarious because if that sole contract ends, so does your primary source of income.

If your business model is akin to putting all your “employment eggs in one basket,” you might as well consider traditional full-time employment. Being a full-time employee offers additional benefits such as healthcare, unemployment insurance, and other perks, which are not typically available to self-employed individuals.

To avoid this risky scenario, it’s crucial to diversify your client base. This might mean stepping away from the comfort and financial security provided by that one major client. It’s about venturing out to explore new opportunities, even if it feels like leaving a comfortable nest. This diversification is a key aspect of being a solopreneur. It not only safeguards your business against the loss of a single client but also provides a more stable and sustainable income source in the long run.

5. Chasing the shiny new technology

This particular issue is especially relevant for solopreneurs in the data analytics field: the temptation to chase after the latest and most glamorous technologies. It’s easy to get caught up in wanting to keep up with the tech elite on platforms like LinkedIn or to have impressive, cutting-edge topics to discuss with family during the holidays.

However, it’s crucial to stay focused on the core purpose of your business: meeting the demands of paying clients, not impressing strangers on social media or relatives. The allure of new technologies can be strong, and the pressure to appear as an up-to-date, tech-savvy professional is real. But remember, what matters most is the practical application of your skills to meet client needs.

While it may feel less exciting to build a business around more established tools like Excel instead of immersing yourself in the latest AI advancements through Coursera courses, it’s often the more prudent choice. Real-world businesses tend to lag a few years behind the cutting-edge technologies. Not every company operates like Amazon or Google, and there’s significant value in focusing on fundamental, widely-used technologies.

It’s important to align your services with the actual needs and current technology usage of your clients. Straying too far into the realm of the latest tech trends can distance you from the real market demand. Therefore, embracing the basics and providing services that are in current demand can be a more sustainable and profitable approach for a data analytics solopreneur.

6. Ignoring your authority building

The allure of a lucrative and engaging contract can be strong, but as a solopreneur, it’s crucial not to neglect other professional obligations. While salaried employees may have the luxury of focusing solely on their assigned tasks, those who are self-employed must continuously engage in activities beyond immediate project work. It’s essential to always be nurturing your business pipeline, engaging with potential customers, and enhancing your professional authority.

Engaging in content marketing through blogs, newsletters, podcasts, or other media can sometimes feel like thankless, unpaid work, especially if it doesn’t immediately lead to new business. However, its impact is often more significant than it seems. These efforts are key to establishing and maintaining your reputation as an expert in your field.

Consistency in building your authority through content marketing is vital. If you neglect this aspect of your business, you risk falling into a cycle of “feast or famine,” where you oscillate between being overwhelmingly busy and desperately searching for new work. Regularly contributing valuable content to your industry not only helps in attracting consistent business but also establishes your credibility, making potential clients more likely to trust and engage with you.

Therefore, while it’s tempting to pour all your energy into high-paying projects, remember the importance of continuously marketing yourself and building your brand. This balanced approach is crucial for long-term success and stability as a solopreneur.

7. Underestimating project scope and milestones

Underestimating the time required for a project can have very different consequences for employees compared to solopreneurs. As an employee, if you misjudge a project’s duration, there are usually mechanisms to adjust the timeline, seek help from a project manager, or redistribute the workload. In most cases, you’ll still receive your regular paycheck, even if the project takes longer than anticipated.

However, when you’re self-employed, the stakes are much higher. Your income often directly depends on completing projects or reaching milestones within a specific timeframe. Failing to meet these goals can result in delayed or lost payments. Moreover, if a project consumes more time than expected, it can monopolize your schedule, leaving little room to work on other projects or secure new ones. This can lead to significant financial strain, especially if you’re unable to accurately estimate project scopes and appropriate charges.

To mitigate this risk, it’s highly advisable to keep detailed timesheets for yourself, even though it may seem tedious or contrary to the reasons you chose self-employment. Tracking your time provides crucial data on how long tasks actually take, helping you develop a better understanding of project scopes and improve your estimations over time.

Furthermore, being familiar with the resources required for your projects and striving to standardize them can greatly assist in planning and billing. This knowledge enables you to create more accurate proposals and avoid the pitfalls of underestimating project durations and costs. While it might seem like an additional burden, effective time management and project planning are essential skills for a successful solopreneur, helping you avoid the billing obstacles that can jeopardize your business.

8. Assuming that you have to be viral to be successful

The notion that one must become a viral sensation to succeed in content marketing and authority building is a misconception. Indeed, social media success can often appear opaque and unpredictable. The algorithms governing platforms like LinkedIn, TikTok, or Instagram are probabilistic, meaning there’s always an element of chance in what content becomes popular or goes viral.

However, this uncertainty shouldn’t lead to a fatalistic attitude. Instead, it emphasizes the need for perseverance and resilience. Even if you’re not achieving the level of social media stardom you aspire to, and it seems like others with seemingly inferior content are gaining more traction, it’s crucial to persist.

Success in content marketing and building authority doesn’t necessarily require a massive following. It’s more about finding and nurturing your specific niche and audience. This approach involves producing content that resonates deeply with a particular group of people, rather than trying to appeal to everyone.

Consistency and persistence in sharing your knowledge and expertise can establish you as a trusted voice in your field, even without a vast audience. Over time, your dedicated audience, though it may be smaller, can prove to be more valuable in terms of engagement and conversion than a larger but less invested following.

In summary, focus on quality over quantity, remain committed to your content strategy, and strive to connect authentically with your audience. This approach will help you build a loyal following that appreciates your unique perspective and expertise, regardless of the size of your social media footprint.

9. Thinking success has to look a certain way

Navigating the journey of being a data analytics solopreneur requires a flexible mindset and a willingness to adapt. It’s common to enter the field with grand visions of solving groundbreaking problems for renowned clients, becoming a global speaker, or gaining fame in a specific area like machine learning. While these aspirations are admirable, the reality of your entrepreneurial journey might unfold differently. For example, you might discover that your niche lies in data visualization, even if you initially aimed for a different specialization.

The key is to remain agile and open to where your skills and market demands intersect. Holding onto rigid expectations or getting caught up in comparing your journey to others’ can lead to dissatisfaction. Success in solopreneurship is highly personal and varies from one individual to another. It’s not just about achieving fame or recognition; it’s also about finding fulfillment, financial stability, and a sense of contentment that perhaps wasn’t present in traditional employment.

It’s important to define success on your own terms. For some, success might mean earning more than they spend and enjoying a greater sense of satisfaction than in previous jobs. For others, it might involve achieving specific professional milestones or making a certain impact in their field.

Avoid the trap of “compare and despair.” Instead, focus on your unique journey, celebrate your achievements, and continuously reassess what success means to you. By doing so, you’ll be able to navigate the challenges of solopreneurship with a healthier mindset and a clearer sense of purpose.

10. Thinking that the next big achievement will solve everything

The analytical mindset common among data analytics solopreneurs can sometimes lead to an overly simplistic approach to problem-solving and success. It’s easy to fall into the trap of thinking that there’s a single missing variable that will unlock success. For instance, observing a successful individual who has published a book might lead you to believe that writing and publishing your own book is the key to achieving similar success.

However, this line of thinking overlooks the complexity and multifaceted nature of success. While publishing a book can certainly contribute positively, enhancing your authority and credibility in your field, it is not a standalone solution to all professional challenges. Success is rarely the result of a single action or achievement; it’s usually the culmination of various efforts, strategies, and experiences.

The Pixar movie Soul offers a poignant exploration of this concept. It delves into the idea that obsessively pursuing specific goals or achievements can lead to a never-ending cycle of dissatisfaction. This mindset can be particularly detrimental for solopreneurs, who often juggle multiple roles and responsibilities.

As a solopreneur, it’s vital to find inner peace and contentment in your journey. Constantly chasing after the next big client, project, or accolade without appreciating your current achievements can lead to burnout and a perpetual sense of inadequacy. Success should be measured not only in external accomplishments but also in personal growth, satisfaction, and the balance you maintain in your professional and personal life.

To thrive as a solopreneur, it’s important to adopt a holistic view of success, one that values incremental progress, personal well-being, and the joy of the journey itself, rather than just the destination.

What other killers are out there?

As I mentioned, these are merely ten potential pitfalls that occurred to me while lounging on my sofa on a lazy Saturday afternoon. Undoubtedly, there are many more. So, what other factors could swiftly undermine analytics solopreneurship? Please share your thoughts in the comments.

Additionally, if you’re considering transitioning to analytics solopreneurship, or seeking guidance on this potentially treacherous path, I’ll be sharing a link to my data analytics career advisory coaching services below.

Expert Roundtable: Python or R for Power BI? (Enterprise DNA)
https://stringfestanalytics.com/expert-roundtable-python-or-r-for-power-bi-enterprise-dna/
Wed, 03 Nov 2021 15:27:32 +0000

The post Expert Roundtable: Python or R for Power BI? (Enterprise DNA) first appeared on Stringfest Analytics.
If you’ve read my blog for any amount of time, you’ll see I’m a big fan of helping analysts mix, match and augment their analytics stack. Heck, I’ve even written a book on how to learn Python and R as an Excel user.

I’m excited that my next project involves R for Power BI users, in partnership with the fantastic Enterprise DNA learning community.

In preparation for this course series, I had a great conversation with Enterprise DNA’s Chief Content Officer Brian Julius and Python for Power BI instructor Gaelim Holland. In this discussion, we reviewed which language to choose, when in your Power BI journey it is appropriate to begin that study, where learning SQL plays into this, and more.

Catch the discussion on YouTube: 👇

Watch out for my course to drop very soon. In the meantime, Gaelim has a Python for Power BI Users course available on Udemy.

What questions do you have about incorporating R and/or Python into your Power BI work and analytics toolkit in general? Are there specific scenarios or use cases you’re considering? Let’s talk in the comments.

How to Build Analytics Technical Skills (How to Get an Analytics Job Podcast)
https://stringfestanalytics.com/aina-analytics-job-podcast/
Sun, 30 May 2021 17:32:17 +0000

The post How to Build Analytics Technical Skills (How to Get an Analytics Job Podcast) first appeared on Stringfest Analytics.
Thanks everyone for attending this event. A recording is available below.

You can learn more about my book Advancing into Analytics: From Excel to Python and R and get the promo code for 30 days FREE reading here.

I’m excited to join How to Get an Analytics Job podcast host John David Ariansen on a live stream Friday 6/4 at 12p Eastern.

You can join on LinkedIn Live or via the YouTube embed below. A recording will be made available there after the event as well.

Some readers may remember that I was a guest on John David’s show earlier to discuss analytics courses.

This show will coincide with my new O’Reilly book, Advancing into Analytics: From Excel to Python and R. We will put technical skills in the broader context of analytics careers.

I hope you can join us LIVE for what should be a fun conversation.

Subscribe to the How to Get an Analytics Job podcast here.

Learn more about my book, including how to read it for FREE, here.

My DataCamp course is live: “Survey and Measure Development in R”
https://stringfestanalytics.com/datacamp-course-announcement/
Sun, 19 May 2019 21:14:22 +0000

The post My DataCamp course is live: “Survey and Measure Development in R” first appeared on Stringfest Analytics.
Very happy to share that my DataCamp course “Survey and Measure Development in R” is now available.

Check it out here — the first chapter is free.

From the course description:

How can we measure something like “brand loyalty?” It’s an obvious measure of interest to marketers, but we can’t quite take a ruler to it. Instead, we can design and analyze a survey to indirectly measure such a so-called “latent construct.” In this course, you’ll learn how to design and analyze a marketing survey to describe and even predict customers’ behavior based on how they rate items on “a scale of 1 to 5.”

You’ll wrangle survey data, conduct exploratory & confirmatory factor analyses, and conduct various survey diagnostics such as checking for reliability and validity.

About DataCamp:

DataCamp provides online data science learning services to 3.3 million users in over 190 countries and 1,000 business customers including Airbnb, Kaiser Permanente and HSBC. Learn more and get started here.

Ceteris Pari-sub
https://stringfestanalytics.com/ceteris-pari-sub/
Sun, 12 Feb 2017 00:48:02 +0000

The post Ceteris Pari-sub first appeared on Stringfest Analytics.

This post has nothing to do with numbers but everything to do with data.

Huh?

Good analysts don’t just know how to analyze data given to them but how to interpret the context of that data and how to design meaningful ways to get good data.

Here in Northeast Ohio, the sandwich chain Subway has been offering $6 footlongs on any sandwich. The sandwiches normally range in price from about $5.50-$8.00.

Why? I asked. Loss leader? Drive traffic in slow winter months? 

Then it hit me. Maybe Subway is running this promotion to collect the most valuable asset of all: data.

Ever heard of the ceteris paribus assumption? It roughly translates from Latin as “with all other things being held equal.”

Good analysts find causal relations in their data. And to do that we need to design studies using ceteris paribus.

When we control for all other variables, what changes when we tweak the one variable? 

Maybe this is Subway’s idea. Nothing else has changed. No new menu items, no changes to the size of the sub, no changes to combo prices, etc. They are holding all other things equal while setting all subs to the same price.

Why? Maybe this gives them data on what customers prefer regardless of price. With price not an issue, what sandwiches are popular?

Will the simplified price structure work better for Subway? I’ve noticed how flat the pricing structure is at Jimmy John’s, a chain comparable to Subway. There are $5 subs and $6.50 subs. No smaller portions, no one-price-per-item menu. Chipotle has a similar flat structure, and increasingly so does McDonald’s.

This is just speculation. There may be other reasons for Subway to run this promotion. Whatever it is, this promotion will give the chain very valuable data, and good analysts will look at assumptions, design, and potential courses of action.

Food for thought!

What do you think, analysts? Any other ways Subway could use this data? What is their experimental design here?

The Unified Theory of Analysts
https://stringfestanalytics.com/the-unified-theory-of-analysts/
Mon, 13 Apr 2015 20:20:32 +0000

The post The Unified Theory of Analysts first appeared on Stringfest Analytics.

]]>
When looking for my first full-time job, I knew I wanted to be an “analyst.”  But there were so many analyst titles: financial, business, and systems, to name a few.  How are they all alike and different?  I couldn’t tell.

Having worked in various analyst roles for the past three years, I can make some sense of the analyst landscape.  And now that I’ve been helping college students find their first jobs, I returned to the question.  Does a “unified theory of the analyst role” exist?

I scoured the Internet to find one.  There is very little on the topic.  It makes sense: the job title is so broad as to be almost meaningless (although that is changing; read on).

The closest thing I found was the BLS Occupational Outlook Handbook.  The BLS has identified seven “analyst” roles.  This is of course not exhaustive — my current title, “business analyst,” is not included, for example.  But it gives an idea of the analyst role’s variety.  Some are heavy on IT, some on financials.

How do these jobs relate?  Being an analyst, I love charts.  So I tried to visualize the analyst family of roles.

I did this with Drew Conway’s data science Venn Diagram.  This chart shows how three disciplines interact to create data science.

This diagram lets me chart where current analyst jobs are versus where they are going.  In light of big data and technology, distinctions between analyst roles are shrinking.  They are moving toward the middle of this diagram.

Most analyst roles are heavy on substantive expertise and light on the other two lenses.  You can see this on the diagram.  Roles like “budget analyst” or “financial analyst” are heavy on spreadsheet-based number-crunching.  But businesses have grown too large and complex for many of today’s spreadsheet practices.  As proof, look at the recent statistic that 88% of spreadsheets contain errors.

While rumors of the spreadsheet’s demise are overstated, these “substance-only” analyst roles need to move closer to the center of this diagram.  Today’s analysts ought to know the basics of data warehousing and visualization.  A lot of this can be done in Excel, but analysts need to “think like a programmer” in spreadsheet design to avoid errors and inefficiencies.

On the flip-side are the IT analysts.  These include roles such as “systems analyst” and “programmer analyst.”  These roles suffer from too little substantive knowledge.  I can attest to the “danger zone” between substance and hacking: without any domain knowledge, fancy data systems are worthless.  I have spent countless hours talking in circles with IT.  They understand the systems, I understand the substance, and nobody can translate.

Math & Statistics analysts do exist; think of actuaries or other data modellers.  This is the least common analyst role, as seen by the absence of a related title from the BLS.  These analyst roles are more likely to lack domain knowledge than hacking skills.  One exception is the operations research analyst role (pictured in the diagram).

The difference between these roles is shrinking in a data-driven economy.  Regardless of title, the analyst’s role is to solve problems.  These problems used to fall into the camp of either “IT problems” or “business problems,” which is why many analyst roles fall safely into one of the data science lenses.  But now, IT problems are business problems, and vice versa.  While the analysts of the past could get by with one set of knowledge in this diagram, it’s now imperative to know all three.  The role I am envisioning is best described as “data analyst,” which is not in the BLS Handbook yet, but it will be.

What does this mean to the analyst job-seeker?  Find the analyst role that suits your talents.  But think about how you can move to the center of this diagram.  For example, if you were a business major, start as a business or financial analyst.  But learn the basics of SQL.  Math geek?  Be a systems analyst, but take a marketing course.

Analysts: what are your thoughts?  Did I explain the analyst family well?  What is its future?  I would love to have more buy-in, as this seems to be lightly trodden ground on the internet.

Data Science Venn Diagram used with permission from Drew Conway.

The post The Unified Theory of Analysts first appeared on Stringfest Analytics.
