The Toolbox, Volume 1: The Sample Size Calculator

August 2, 2017 | September 24, 2020 • Articles

Many individuals who find themselves managing labor programs do not have formal training in engineering concepts that are incredibly helpful in ensuring their success and the success of their company. The Toolbox looks to cover one of these concepts each month, providing useful instruction, templates, and tools that you can put into practice.

This month’s tool download: The Sample Size Calculator

Welcome to the first installment of a monthly Logile offering, The Toolbox. Through my years working with individuals leading workforce management programs, I have come to realize that many of them have risen through their organizations into these roles, gaining deep experience of their business, industry and customers along the way. However, in many cases these individuals never received formal training in concepts, tools, and approaches that can greatly assist them in different facets of their current role. The purpose of this regular offering is to provide you with training and tools that you can put into practice right away to achieve better results in your labor management programs.

So, before we dive into the theory or the tool itself this month, let’s discuss a typical challenge posed to those working in labor management programs. Your company is considering a change to a standard operating procedure. Maybe it is introducing new technology to enhance the customer experience, adopting a new marketing strategy, or just changing something for the sake of changing it (we have all been there). In an organization where the labor management team has been integrated into evaluating potential changes to operations (and if yours has not, it is time that the leader of your group speaks up), you may be tasked with evaluating the impact of making this type of change. The company has set up a pilot program in a location and you have traveled to observe the new process. And now what?

If you and your organization use a predetermined time and motion system such as MOST, the answer may be simple (and if you do not, please feel free to reach out to learn about the benefits). You observe the process, write your method descriptions, develop your sequence models, and calculate the time for the overall process, later performing some form of extrapolation across the organization to determine the impact of the potential change.

However, what do you do if you do not use a predetermined time and motion system? Furthermore, systems like MOST are only useful when there is motion to study. What if you are trying to understand the impact of a change not concerned with motion, such as a machine processing time or an interaction between a customer and an associate demonstrating a new product? The answer is that you need to perform a time study.

Assuming that you understand the proper approach to designing and conducting a time study, the question still remains – how many times must you observe and measure the process with a stopwatch? The correct answer is, as many times as necessary to achieve the acceptable statistical accuracy prescribed by your organization for such data. But what does that mean?

Time study, along with many other forms of data collection, is a sampling process. What this means is that we can assume that our samples are distributed normally about an unknown population average (the average of all of the occurrences of this process) with an unknown variance.[i] The number of samples that you will need to collect depends on how large the variance, or spread, is among your samples. Without diving too deep into the statistics or theory, we can use statistical approaches related to sample populations to arrive at the following equation for calculating the variance based on your observations:
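The equation image from the original post is not reproduced here. As a sketch, the standard textbook form for the sample mean and sample variance is shown below. The symbols are assumptions chosen for this illustration: x_i is an individual time reading, n is the number of readings collected so far, x-bar is their average, and s is the sample standard deviation.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s^{2} = \frac{\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^{2}}{n - 1},
\qquad
s = \sqrt{s^{2}}
```

The divisor is n - 1 rather than n because the variance is estimated from a sample rather than from the full population of occurrences.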

With time study we are almost always dealing with a very small initial sample (we recommend close to 30 initial samples to use in this exercise). Due to this, we must use a t-distribution to estimate confidence intervals (that statistical accuracy prescribed by your organization mentioned above). This yields the following:
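The equation image is missing at this point as well. Assuming the usual small-sample form, the confidence interval around the observed average can be sketched as follows, where x-bar is the sample mean, s is the sample standard deviation, n is the number of readings, and t is the two-tailed t-distribution value for n - 1 degrees of freedom at the chosen probability (all symbol names are assumptions of this sketch):

```latex
\bar{x} \;\pm\; t \, \frac{s}{\sqrt{n}}
```

The term after the plus-or-minus sign is the half-width of the interval. It shrinks as n grows, which is why more readings buy more accuracy.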

Finally, we can solve for n to determine the total number of samples in addition to the initial collection that we need to measure:
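The original equation image is not available here, so what follows is a sketch of the textbook relation rather than a copy of it. It introduces one symbol the article does not name: k, the acceptable error expressed as a fraction of the mean (for example, 0.05 for plus or minus 5 percent). Setting the half-width of the confidence interval equal to k times the mean and solving for n gives:

```latex
t \, \frac{s}{\sqrt{n}} = k \, \bar{x}
\quad\Longrightarrow\quad
n = \left( \frac{t \, s}{k \, \bar{x}} \right)^{2}

% Illustrative numbers (hypothetical, not from the article):
% \bar{x} = 40 s, s = 4 s, t = 2.045 (29 degrees of freedom, P = 0.05), k = 0.05
n = \left( \frac{2.045 \times 4}{0.05 \times 40} \right)^{2}
  = \left( \frac{8.18}{2} \right)^{2}
  = 4.09^{2} \approx 16.7 \;\rightarrow\; 17
```

The result is always rounded up. In the textbook form of this relation, n is usually read as the total number of observations required, so readings already collected would count toward it; whether the calculator reports the total or only the remaining readings is not stated here.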

So now that we’ve concluded our statistics lesson for today, how do we actually use this information?

The first thing that you must do is set up your time study utilizing the proper methodology (i.e., document the entire process, break it down into work elements, define start and end points for each element, etc.). Once you have done that, you must collect an initial sampling of times. We recommend collecting 30 initial time samples. Once we have this data, all we need is to determine the desired accuracy and start using the provided tool.

This accuracy is expressed in the t-distribution table as Probability (P), which refers to the sum of the two tail areas (right and left) of our bell-shaped distribution. Basically, we are defining the odds that any sample falls in the main portion of our bell-shaped graph (between the tails). As we decrease P, we raise the odds that a sample falls between the tails rather than within them, and we increase the accuracy of our measurement. However, we also increase the number of samples that we may have to collect to achieve this accuracy. A general best practice, and what Logile recommends, is to require an accuracy of 95 percent, or P = 0.05.
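To make the link between P and the t-value concrete, here is a minimal Python sketch that uses only the standard library. It is not how the calculator or any printed table works internally; it simply integrates the t-distribution density numerically and searches for the t-value whose two tail areas sum to P. The function names (`t_pdf`, `two_tail_area`, `t_critical`) are hypothetical and chosen for this illustration.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


def two_tail_area(t: float, df: int, steps: int = 2000) -> float:
    """P: combined area of both tails beyond -t and +t (Simpson's rule)."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # area between 0 and t
    return 1.0 - 2.0 * central_half


def t_critical(p: float, df: int) -> float:
    """Find the t-value whose two-tail area equals p, by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if two_tail_area(mid, df) > p:
            lo = mid  # tails still too large, move outward
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # 30 initial readings give 29 degrees of freedom.
    for p in (0.10, 0.05, 0.01):
        print(f"P = {p:.2f} -> t = {t_critical(p, 29):.3f}")
```

For 30 initial readings and P = 0.05, the result should land close to the tabulated value of 2.045. Smaller P values give larger t-values, which is what drives the required sample size up.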

Figure 1 – An example of a normal distribution (the bell shape) with the tails highlighted in yellow. The tails represent the portion of samples that will fall outside of our accepted accuracy. The lower the P value, the smaller the yellow areas and the higher the odds that a sample falls between those yellow areas.

So now that we’ve discussed the statistics that this process is based on, collected our initial samples, and determined our desired accuracy, let’s explore how to use the tool provided in this installment (download link provided at the top of this post).

The instructions are listed in the document, but we will review them quickly here as well. First, take your samples (in seconds) and type them into the shaded cells in column B (starting in cell B4). Then select the desired confidence level in cell G13 (set by default to 95 percent). Once you have done these two things, any samples beyond acceptable control limits will be highlighted in red; delete these values. Once you have done this, your required sample size will be presented in cell G16, highlighted in green.
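The screening step can be sketched in Python. The article does not say how the spreadsheet sets its control limits, so this example assumes the common convention of the mean plus or minus three sample standard deviations; both the `sigmas` parameter and the `flag_out_of_control` function are hypothetical, and the readings are made up for illustration.

```python
from statistics import mean, stdev


def flag_out_of_control(samples, sigmas=3.0):
    """Split readings into flagged and kept, using mean +/- sigmas * s.

    The three-sigma rule is an assumption of this sketch; the actual
    calculator may use different control limits.
    """
    centre = mean(samples)
    spread = stdev(samples)  # sample standard deviation (n - 1 divisor)
    lower = centre - sigmas * spread
    upper = centre + sigmas * spread
    flagged = [x for x in samples if x < lower or x > upper]
    kept = [x for x in samples if lower <= x <= upper]
    return flagged, kept, (lower, upper)


# 30 made-up readings in seconds, one of them clearly abnormal.
readings = [40.0] * 15 + [42.0] * 14 + [90.0]
flagged, kept, limits = flag_out_of_control(readings)

if __name__ == "__main__":
    print("control limits:", limits)
    print("flagged:", flagged)
    print("readings kept:", len(kept))
```

Note that an extreme reading inflates the standard deviation it is tested against, so with very small samples a three-sigma rule may flag nothing at all. That is one more reason to start from roughly 30 readings.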

Figure 2 – Screenshot of this month’s tool, the Sample Size Calculator

The tool applies the equations presented above to every sample that you enter. Practice using it by inputting fabricated values and changing the Confidence selection to see how the calculations and the Required Samples value change as you increase or decrease the Confidence, and how they change as the variance between your values changes.
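The same experiment can be tried outside the spreadsheet. The Python sketch below is an approximation of the calculation under stated assumptions, not the calculator's actual formula: it assumes exactly 30 initial readings (29 degrees of freedom), takes the two-tailed t-values from a standard t-table, and introduces a hypothetical parameter `k` for the acceptable error as a fraction of the mean (5 percent here).

```python
import math
from statistics import mean, stdev

# Two-tailed t-values for 29 degrees of freedom, from a standard t-table.
T_29_DF = {0.90: 1.699, 0.95: 2.045, 0.99: 2.756}


def required_samples(samples, confidence=0.95, k=0.05):
    """Required number of readings: n = (t * s / (k * mean))^2, rounded up.

    k (acceptable error as a fraction of the mean) is an assumption
    of this sketch; the article does not name it.
    """
    if len(samples) != 30:
        raise ValueError("this sketch only carries t-values for 30 readings")
    t = T_29_DF[confidence]
    x_bar = mean(samples)
    s = stdev(samples)
    return math.ceil((t * s / (k * x_bar)) ** 2)


# Fabricated readings in seconds: same average, different spread.
tight = [38 + (i % 5) for i in range(30)]     # 38 to 42
wide = [30 + 5 * (i % 5) for i in range(30)]  # 30 to 50

if __name__ == "__main__":
    for label, data in (("tight", tight), ("wide", wide)):
        for conf in (0.90, 0.95, 0.99):
            print(label, conf, required_samples(data, conf))
```

Both fabricated data sets average 40 seconds, yet the widely spread one needs far more readings, and every increase in confidence raises the requirement further.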

As with many processes related to workforce management, there is a right way to conduct a time study to ensure that you produce the most accurate results possible. If you do not calculate the correct sample size, you risk basing something like a labor standard on data that does not truly represent what is going on in your organization. For processes that occur in great volume, such as register transactions for a retailer, a poor standard based on inadequate measurement can result in either millions of dollars of additional, unnecessary annual labor costs, or not adequately staffing to handle your customer volume. Now you have one more tool in your toolbox to ensure that this is done correctly.

[i] Benjamin W. Niebel and Andris Freivalds, Methods, Standards, and Work Design (McGraw-Hill, 2003), 393.
