Spreadsheets are a very important data analytics tool. In this part of the course, you will learn about how data analysts use spreadsheets in their work every day. You will also explore why structured thinking helps analysts better understand problems and come up with solutions.

Learning Objectives

  • Discuss the data analyst’s use of spreadsheets with reference to roles and responsibilities
  • Demonstrate the use of spreadsheets to complete basic tasks of the data analyst including entering and organizing data
  • Demonstrate an understanding of the use of formulas in spreadsheets including a definition and specific examples
  • Compare formulas and functions with reference to similarities and differences
  • Describe the key ideas associated with structured thinking including the problem domain, scope of work, and context

Working with spreadsheets


Video: The amazing spreadsheet

Spreadsheets are a versatile tool for data analysts. They can be used to answer data-driven questions, build evidence, visualize data, and support findings. Spreadsheets can also be used to perform both basic and complex calculations automatically. This helps analysts work more efficiently and understand the results of their calculations.

In this part of the program, you will revisit the spreadsheet and learn about some of the functions and formulas that you can use to perform calculations. You will also have the opportunity to work with real data from databases to reorganize a spreadsheet and perform data analysis.

Spreadsheets are a powerful and versatile tool that can be used for a variety of tasks, including data analysis. They are often the first tool that data analysts reach for when trying to answer data-driven questions.

Here are some of the benefits of using spreadsheets for data analysis:

  • Spreadsheets are easy to use. Even if you don’t have a lot of experience with spreadsheets, you can learn how to use them quickly and easily.
  • Spreadsheets are versatile. You can use spreadsheets to perform a wide variety of tasks, including data entry, calculations, and data visualization.
  • Spreadsheets are portable. You can easily save and share spreadsheets, making them a great tool for collaboration.
  • Spreadsheets are affordable. There are many free and open-source spreadsheet applications available.

Here are some of the things you can do with spreadsheets for data analysis:

  • Enter data. Spreadsheets are a great way to enter data. You can type or paste data directly into cells, and most spreadsheet applications will recognize common data types, such as numbers and dates, and apply a default format.
  • Calculate data. Spreadsheets can be used to perform a wide variety of calculations. You can use formulas to calculate sums, averages, and other statistics.
  • Visualize data. Spreadsheets can be used to visualize data. You can create charts and graphs to help you understand the data.
  • Present data. Spreadsheets can be used to present data. You can export a spreadsheet to PDF, copy its charts into a slide presentation, or share the file online.

If you are new to data analysis, spreadsheets are a great place to start. They are easy to use and versatile, making them a valuable tool for any data analyst.

Here are some additional tips for using spreadsheets for data analysis:

  • Use clear and descriptive labels for your data. This will make it easier to understand and interpret your data.
  • Use formulas to automate your calculations. This will save you time and help you avoid errors.
  • Use charts and graphs to visualize your data. This will help you understand the data and communicate your findings to others.
  • Share your spreadsheets with others. This will help you collaborate and get feedback on your work.

Hi, again. I’m glad you’re back. In this part of the program, we’ll revisit the spreadsheet. Spreadsheets are a powerful and versatile tool, which is why they’re a big part of pretty much everything we do as data analysts. There’s a good chance a spreadsheet will be the first tool you reach for when trying to answer data-driven questions. After you’ve defined what you need to do with the data, you’ll turn to spreadsheets to help build evidence that you can then visualize, and use to support your findings.

Spreadsheets are often the unsung heroes of the data world. They don’t always get the appreciation they deserve, but as a data detective, you’ll definitely want them in your evidence collection kit. I know spreadsheets have saved the day for me more than once. I’ve added data for purchase orders into a sheet, set up formulas in one tab, and had the same formulas do the work for me in other tabs. This frees up time for me to work on other things during the day. I couldn’t imagine not using spreadsheets.

Math is a core part of every data analyst’s job, but not every analyst enjoys it. Luckily, spreadsheets can make calculations more enjoyable, and by that, I mean easier. Let’s see how. Spreadsheets can do both basic and complex calculations automatically. Not only does this help you work more efficiently, but it also lets you see the results and understand how you got them. Here’s a quick look at some of the functions that you’ll use when performing calculations. Many functions can be used as part of a math formula as well. Functions and formulas also have other uses, and we’ll take a look at those too. We’ll take things one step further with exercises that use real data from databases. This is your chance to reorganize a spreadsheet, do some actual data analysis, and have some fun with data.

Video: Get to work with spreadsheets

Data analysts use spreadsheets to organize and analyze data. They can use spreadsheets to create pivot tables, filter data, and perform calculations. Spreadsheets are a versatile tool that can be used for a variety of tasks.

In the video, you will learn about the following ways data analysts use spreadsheets:

  • To analyze data from a construction company’s expenses.
  • To create a pivot table to organize the data.
  • To filter the data to focus on a specific time frame.
  • To use formulas and functions to perform calculations on the data, such as finding the most expensive construction projects.

You will also have the opportunity to work in your own spreadsheets in the future.
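
The construction-company workflow above (organize, filter to a time frame, then calculate) can also be sketched outside a spreadsheet. The Python below is only an illustration under stated assumptions: the expense records, project names, and the helper functions `filter_by_period` and `total_by_project` are invented for this example and do not come from the video.

```python
from collections import defaultdict
from datetime import date

# Hypothetical expense records: (project, date, amount). Invented for illustration.
expenses = [
    ("Bridge repair", date(2024, 1, 15), 12000.0),
    ("Office build", date(2024, 2, 3), 8500.0),
    ("Bridge repair", date(2024, 3, 20), 4000.0),
    ("Parking lot", date(2024, 3, 28), 6000.0),
    ("Office build", date(2023, 11, 9), 20000.0),
]


def filter_by_period(rows, start, end):
    """Keep only the rows whose date falls between start and end, inclusive."""
    return [row for row in rows if start <= row[1] <= end]


def total_by_project(rows):
    """Sum the amounts per project, much like a pivot table with one row field."""
    totals = defaultdict(float)
    for project, _, amount in rows:
        totals[project] += amount
    return dict(totals)


# Filter to a three-month window, then summarize and find the costliest project.
recent = filter_by_period(expenses, date(2024, 1, 1), date(2024, 3, 31))
totals = total_by_project(recent)
most_expensive = max(totals, key=totals.get)

print(totals)
print(most_expensive)
```

In a spreadsheet, the same three steps would typically be a pivot table, a date filter, and a sort or MAX calculation on the totals.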

Spreadsheets are a powerful tool for data analysis. They can be used to organize data, perform calculations, and create visualizations.

To get started with spreadsheets for data analysis, you will need to:

  1. Choose a spreadsheet application. There are many different spreadsheet applications available, such as Microsoft Excel, Google Sheets, and LibreOffice Calc. Choose the one that you are most comfortable with.
  2. Learn the basics of spreadsheet navigation. This includes how to move around the spreadsheet, select cells, and enter data.
  3. Learn how to use formulas and functions. Formulas are used to perform calculations on data in cells. Functions are pre-written formulas that can be used to perform common tasks.
  4. Learn how to create charts and graphs. Charts and graphs can be used to visualize data and make it easier to understand.
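
To make the difference between a formula and a function in step 3 concrete, here is a rough Python analogy. The variable names `b2` through `e2` and their values are made up to stand in for spreadsheet cells.

```python
# Hypothetical sales values standing in for cells B2 through E2.
b2, c2, d2, e2 = 100, 250, 175, 300

# "Formula" style: spell out the calculation with operators, like =B2+C2+D2+E2.
total_by_formula = b2 + c2 + d2 + e2

# "Function" style: call a prebuilt routine, like =SUM(B2:E2).
total_by_function = sum([b2, c2, d2, e2])

print(total_by_formula, total_by_function)
```

Both approaches give the same total. The function version is shorter and is less likely to leave out a cell as the range grows.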

Once you have learned the basics of spreadsheets, you can start using them for data analysis. Here are some of the things you can do with spreadsheets for data analysis:

  • Organize data. Spreadsheets can be used to organize data in a variety of ways. You can create tables, sort data, and filter data to find the information you need.
  • Perform calculations. Spreadsheets can be used to perform calculations on data. You can use formulas to calculate sums, averages, and other statistics.
  • Create visualizations. Spreadsheets can be used to create charts and graphs to visualize data. This can help you understand the data and communicate your findings to others.

To perform calculations in a spreadsheet, data analysts use formulas and functions.

True

To perform calculations in a spreadsheet, data analysts use formulas and functions, such as SUM, AVERAGE, and COUNT.

What are the first steps a data analyst takes when working with data in a spreadsheet?

Sort and filter

The first steps a data analyst takes when working with data in a spreadsheet are to sort and filter the data.

Data analysts spend a lot of time organizing data and performing calculations. Luckily, there are lots of different tools to help them do just that, including spreadsheets. In this video we’ll take a look at some of the ways data analysts use spreadsheets to help them with their day-to-day responsibilities. Later, you’ll get to test out some of these things yourself, but for now, let’s start with a quick look at how data analysts use spreadsheets to do their jobs. This will change depending on the work you need to complete. But here’s an overview of a few of the major tasks.

Imagine you work for a construction company. Your company needs your spreadsheet skills to analyze some data about their expenses, so you access the appropriate data and add it to your spreadsheet. We won’t cover all the details of this project right now, but you will get a chance to see lots of spreadsheet features up close and personal as we move forward.

What do you do with the data now that it’s in your spreadsheet? Again, this will be different for each job, but you might start by organizing your data with the task you’ve been given. For example, you might put your data in a pivot table. We’ve talked about pivot tables before in this course. We’ll cover them in more detail later on, but for now, just think of them as well-organized and very useful tables. Next, you might filter the data in the pivot table. Sorting and filtering data is a common part of most jobs. This lets you focus only on the data you’ll need for your analysis. In our example, maybe you only need the expenses for a certain time frame, like the last three months.

After you’ve filtered your data, you could perform some calculations to learn more about it. Maybe you need to find out which construction projects ended up costing the most money. This is where formulas and functions are really handy. We’ll talk about them in just a bit, but formulas and functions are great for doing some quick math, especially once you run out of fingers and toes to count on.

Now you’ve seen some of the ways data analysts are using spreadsheets in their day-to-day work for a lot of different tasks, including organizing their data and making calculations. Before you know it we’ll have you working in your own spreadsheets.

Reading: Spreadsheets and the data life cycle

Practice Quiz: Hands-On Activity: Introduction to Google Sheets

Video: Step-by-step in spreadsheets

In this video, you will learn how to use spreadsheets to organize data. Here are the steps:

  1. Open a new spreadsheet.
  2. Give the spreadsheet a descriptive title.
  3. Create a folder on your computer to store your spreadsheets and related files.
  4. Enter your data into the spreadsheet.
  5. Make the column widths wider so that you can see the data clearly.
  6. Format the data attributes (variables) in the first row of the spreadsheet.
  7. Add borders to the data table to make each piece of data more clearly visible.

These steps will help you organize your data and make it easier to analyze.

Step 1: Choose a spreadsheet application.

There are many different spreadsheet applications available, such as Microsoft Excel, Google Sheets, and LibreOffice Calc. Choose the one that you are most comfortable with.

Step 2: Enter your data.

Once you have chosen a spreadsheet application, you can start entering your data. Be sure to label your columns and rows clearly so that you can easily understand your data later.

Step 3: Organize your data.

Once you have entered your data, you can start organizing it. This may involve sorting your data in a particular order, grouping your data together, or creating charts and graphs to visualize your data.

Step 4: Perform calculations.

Spreadsheets can be used to perform calculations on your data. This may involve calculating sums, averages, or other statistics.

Step 5: Analyze your data.

Once you have organized and calculated your data, you can start analyzing it. This may involve looking for trends, patterns, or outliers in your data.

Step 6: Share your findings.

Once you have analyzed your data, you can share your findings with others. This may involve creating a report, presenting your findings to a group, or publishing your findings online.
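
Steps 3 through 5 above (organize, calculate, analyze) can be sketched in a few lines of Python. This is a minimal illustration only: the place names and numbers are placeholders invented for the example, not real population figures.

```python
from statistics import mean

# Placeholder rows: (label, value). Not real data.
rows = [("Country A", 120), ("Country B", 45), ("Country C", 300), ("Country D", 80)]

# Organize: sort from largest to smallest value.
sorted_rows = sorted(rows, key=lambda row: row[1], reverse=True)

# Calculate: a sum and an average, like the SUM and AVERAGE functions.
values = [value for _, value in rows]
total = sum(values)
average = mean(values)

# Analyze: flag the rows that sit above the average.
above_average = [name for name, value in sorted_rows if value > average]

print(sorted_rows[0], total, average, above_average)
```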

We’ve talked about how spreadsheets are great for organizing data and performing calculations. Now, it’s time to get our hands dirty and start building a real spreadsheet. In this video, I’m going to demonstrate some basic tasks we know data analysts use spreadsheets for, including entering and organizing data. We’ll start with a step-by-step process to show you some tools to organize your data in a spreadsheet. Consider these steps the basics. You won’t always have to use them when working with a data set, but if your data is a bit messy when you get it, these steps can help you get it ready for analysis.

Let’s start by opening a new spreadsheet. As a data analyst, you might not start with a blank spreadsheet, but it’s good to know how to do it, just in case. Start by opening Excel, Google Sheets or whatever spreadsheet software you’re using, then select a new blank file. The first thing you’ll want to do when you open a new spreadsheet is give it a title. Here’s a pro tip. Make your title short, clear, and have it state exactly what the data in the spreadsheet is about. Trust me, it’ll make searching for it a lot easier. Creating a folder on your computer specifically for spreadsheets and related files can also make it easier to find them. For this spreadsheet, it’s already saved in our drive. So we’ll open our File menu to click Move. Then we’ll create a new folder, name it “Population Data,” and move the spreadsheet there. Our spreadsheet now has a new home. This will save you a lot of unnecessary clicks and headaches when you look for this file.

There’s a few different ways data analysts get data they work with. Depending on the job, you might use data from an open source, you might be given data to work with or you might be asked to find your own data. You’ll experience all of these later in the program. There’s a lot of open data sources online, where data is made available to the public. For example, we’ll use data from worldbank.org, that’s already in the spreadsheet. The data shows the population of Latin American and Caribbean countries from 2010-2019. Let’s open this spreadsheet.

Time to get the data ready for analysis. We’ll start by selecting the whole sheet and making our columns wider by dragging the boundary of one of the columns. This will help us see the data clearly, then we can adjust any individual columns that need it. You can make columns wider in other ways as well, but this will work for now. The first row of the spreadsheet is for data attributes or variables. It’s basically labeling the type of data in each column. Let’s make the attributes stand out from the rest of the rows by selecting it and filling it with color. We’ll also make the labels bold. If we want to add another data attribute between two of the other attributes, we can always add a new column. Just click on any cell within a column and use the Insert menu to add a new one. It will appear next to the column you originally clicked, pretty simple. Deleting a column is just as simple. To delete, right-click in a cell in the column you want to get rid of. The steps we’re showing may be different depending on the spreadsheet program you’re using, but should be pretty similar.

Let’s add one more thing to our data table: borders. This can help you see each piece of data more clearly. To add borders start by clicking the Select All button at the top left corner of your spreadsheet. This is like a magic button because you can click it whenever you need to make changes to every cell in your spreadsheet. Then click the Border button in the menu, and choose the type of borders you want. To keep our spreadsheets uniform, we’ll choose borders for all cells. Just like that, we’ve gone from raw to refined. Now our spreadsheet is filled with data and it’s nice to look at too. Using these organization tools before you analyze can help you focus on the data once you start your analysis.

Now that we’ve gone over some ways spreadsheets can be used to organize data, you’re ready to start working on them yourself. Later you’ll learn more about spreadsheets, including some common errors and how to fix them.

Reading: Learn more about spreadsheet basics

Practice Quiz: Test your knowledge on working with spreadsheets

When giving a spreadsheet a title, what are some best practices to follow? Select all that apply.

Fill in the blank: Data analysts can use _____ to highlight the area around cells in order to see spreadsheet data more clearly.

Within a spreadsheet, data analysts use which tools to save time and effort by automating commands? Select all that apply.

Formulas in spreadsheets


Video: Formulas for success

Formulas are equations that can be used to perform calculations in spreadsheets. They are made up of operators, which are symbols that represent mathematical operations such as addition (+), subtraction (-), multiplication (*), and division (/). Formulas can also include cell references, which are the addresses of cells in the spreadsheet.

To create a formula, you start with an equal sign (=) and then type the formula. For example, to calculate the total sales for the first row of data, you would type the following formula into cell F2:

=B2+C2+D2+E2

This formula tells the spreadsheet to add the values in cells B2, C2, D2, and E2.

You can also use formulas to perform more complex calculations, such as finding the average sales or the percent change in sales between two time periods. For example, to find the average sales for the first row of data, you would type the following formula into a different, empty cell in that row, such as G2:

=(B2+C2+D2+E2)/4

This formula tells the spreadsheet to add the values in cells B2, C2, D2, and E2, and then divide the total by 4.
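
As a rough analogy, the two formulas above can be mirrored in Python with a dictionary that maps cell addresses to values. The sales numbers, the year in A2, and the function names `total_sales` and `average_sales` are assumptions made up for this sketch.

```python
# Hypothetical sheet: cell address -> value. Row 2 holds one year of sales.
sheet = {"A2": 2019, "B2": 100.0, "C2": 250.0, "D2": 175.0, "E2": 300.0}


def total_sales(cells):
    """Mirror of =B2+C2+D2+E2: read each referenced cell, then add."""
    return cells["B2"] + cells["C2"] + cells["D2"] + cells["E2"]


def average_sales(cells):
    """Mirror of =(B2+C2+D2+E2)/4: the grouped sum is divided by 4."""
    return (cells["B2"] + cells["C2"] + cells["D2"] + cells["E2"]) / 4


print(total_sales(sheet))    # 825.0
print(average_sales(sheet))  # 206.25

# Correcting a source value changes the result the next time it is evaluated,
# which is similar to a spreadsheet recalculating after you edit a cell.
sheet["D2"] = 275.0
print(total_sales(sheet))    # 925.0
```

Because the functions read from the cells instead of hard-coding the numbers, a corrected value flows through to the total without rewriting the calculation.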

Formulas are a powerful tool that can be used to perform a wide variety of calculations in spreadsheets. By learning how to use formulas, you can make your data analysis more efficient and accurate.

Here are some additional tips for using formulas in spreadsheets:

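  • Use descriptive column labels, or named ranges where your application supports them, to make your formulas easier to read and understand.
  • Use parentheses to group values in formulas and to control the order in which operations are performed.
  • Use your application’s formula-checking tools where available (for example, the Evaluate Formula tool in Excel) to help you troubleshoot errors in your formulas.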
  • Copy and paste formulas to save time when entering them into multiple cells.
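
The tip about parentheses is easy to check with a small calculation. Python follows the same order of operations as spreadsheet formulas for these four operators, so this sketch reuses the expression "15 plus 8 divided by 2" to show how grouping changes the result.

```python
# Without parentheses, division happens before addition: 15 + (8 / 2).
without_parentheses = 15 + 8 / 2

# With parentheses, the addition is grouped and happens first: (15 + 8) / 2.
with_parentheses = (15 + 8) / 2

print(without_parentheses)  # 19.0
print(with_parentheses)     # 11.5
```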

The video also covers the following topics:

  • How to use cell references in formulas
  • How to copy and paste formulas
  • How to troubleshoot errors in formulas
  • How to use the Percent button to change a value to a percentage
  • How to use the Formula Evaluator

Overall, the video provides a good overview of how to use formulas in spreadsheets. It is a good resource for anyone who wants to learn how to perform calculations in spreadsheets using formulas.

  • Ask the right questions: The first step to successful data analysis is to ask the right questions. What do you want to learn from the data? What are your goals? Once you know what you’re looking for, you can start to think about the data you need to collect and the analysis methods you’ll use.
  • Clean your data: Garbage in, garbage out. This is a fundamental principle of data analysis. Before you can start analyzing your data, you need to make sure it’s clean and free of errors. This may involve removing duplicate data, correcting typos, and filling in missing values.
  • Use the right tools: There are a variety of tools available for data analysis, each with its own strengths and weaknesses. The right tool for you will depend on the type of data you’re working with, the analysis methods you want to use, and your budget.
  • Understand your limitations: No data set is perfect. There will always be some level of uncertainty in your results. It’s important to understand the limitations of your data so that you can interpret your results correctly.
  • Communicate your findings: The final step in data analysis is to communicate your findings to others. This could involve writing a report, giving a presentation, or creating a visualization. The way you communicate your findings will depend on your audience and your goals.

Here are some additional tips for success in data analysis:

  • Be curious and ask questions.
  • Be creative and think outside the box.
  • Be persistent and don’t give up easily.
  • Be open to feedback and be willing to learn.
  • Be ethical and responsible with your data.

In spreadsheets, what is the term for the symbols used in formulas to perform a specific calculation?

Operators

In spreadsheets, the symbols used in a formula to perform a specific calculation are called operators.

So far we’ve covered how to start a new spreadsheet, enter in data, and make it look refined and ready for some serious analysis. Now we’ll learn how to perform calculations in your spreadsheet. You may need to calculate everything from sums to averages, to finding minimum and maximum amounts. You’ll use calculations for a lot of different kinds of tasks. In this video, we’ll focus on learning the basics and then do a little math with some sales data to practice.

Let’s talk about formulas first. You might remember that a formula is a set of instructions that performs a specific calculation. Basically, formulas can do the math for you. Now, they don’t only do math, they can do a lot more. Soon you’ll learn different ways you can use them throughout the data analysis processes. Formulas are built on operators which are symbols that name the type of operation or calculation to be performed. For example, a plus sign is a common operator. The formulas you use as a data analyst will usually include at least one operator.

Now, let’s talk about math expressions or equations. These can take a lot of different forms, but you might be familiar with them already. 3 minus 1, 15 plus 8 divided by 2, 846 times 513. These are all examples of expressions. Is this bringing back memories of grade school? Well, back in math class, you most likely learned to complete an expression by including an equal sign and the solution. It’s slightly different with spreadsheets. When you create a formula using an expression in a spreadsheet, you start the formula with an equal sign. For example, if we want to subtract, we type an equal sign followed by the rest of the expression without any spaces in the formula. Now let’s try an expression that’s a bit more challenging. We’ll type 31982, then a hyphen for a minus sign, then 17795. To calculate, we press “Enter.” You’ll most likely use formulas this way when dealing with large numbers or expressions with multiple steps.

Here are the operators you will use to complete formulas. The plus sign for addition, the minus or hyphen for subtraction, the asterisk for multiplication, and the forward slash for division. The division and multiplication symbols might be different than what you’re used to. Small changes, but important to keep in mind.

If you already have data in your spreadsheet, you can use cell references in your formulas instead. A cell reference is a single cell or range of cells in a worksheet that can be used in a formula. Cell references contain the letter of the column and the number of the row where the data is. A range of cells is a collection of two or more cells. A range can include cells from the same row or column, or from different columns and rows collected together. We’ll show you an example in an upcoming video.

Now let’s apply what we just learned to some sales data. If we want to add these figures to find the total sales for the first row of data, you can click “cell F2”. From there, we’ll start with an equal sign and use the cell references to input values in your expression. We’re starting with cell B2 because the year in A2 is not a value we want to add to the total. Then press “Enter.” Just like that, your total sales has been calculated for you, but what if you realized one of the values in your data was wrong? No problem. You can change the value in any cell the formula uses and the total will update automatically.

The great thing about using cell references is that they also automatically update when a formula is copied to a new cell. Talk about a time-saver. Instead of entering the same formula again for every new set of cell references, just copy the formula using the menu or a keyboard shortcut like Control plus C. Then paste the formula where you want to apply it using Control plus V. And presto! The formula updates all the new cells and values correctly.

Now let’s say you also want to find the average sales. For this, you create a new formula in a different cell. To group values in a formula, use parentheses. This lets your spreadsheet know which values to calculate together and the order of the operations to be performed. For example, open parentheses, then B2 plus C2 plus D2 plus E2, and close parentheses, then divide the value of all of this by typing slash four. You are adding the values in the four cells together and then using the slash to divide the total by four, and just like the last one, we can copy and paste the formula.

Here’s another formula you can use if you want to find the percent change in sales between June and July. Once a formula calculates the value, you can then use the percent button to change the value to a percentage. When you apply the formula to the other rows, both the formula and the percent will automatically update. That doesn’t look like the right answer. Looks like we’ve got an error. Don’t worry. Errors can happen at any stage of data analysis, and that includes when you’re using spreadsheets. A formula has to be airtight. If there’s something wrong with one of the cell references, it won’t work. So what’s our error? Well, we can see that the value in cell D4 is missing. It might take some time and research on your part to find the correct value, but it’s worth it. You want your analysis to be as accurate as possible. When you do add the value, the formula takes care of the rest.

That was a lot to take in. Thanks for staying with me. You’ll be able to apply what you learned about formulas here and later in the program to make your analysis more efficient and your job a little easier, and soon you’ll work in your own spreadsheet. Happy spreadsheeting.
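
Two ideas from the video, cell references that update when a formula is copied and percent change, can be approximated in Python. Treat this as a simplified sketch, not a description of how any spreadsheet application works internally. The helper names `shift_rows` and `percent_change` are hypothetical, the sales values are invented, and the percent-change definition used here, (new - old) / old, is the common one and is assumed because the video does not spell out its formula.

```python
import re


def shift_rows(formula, offset):
    """Return the formula with every relative row number moved by offset.

    Simplification: any run of capital letters followed by digits is treated
    as a cell reference, so function names that end in digits would be
    changed by mistake.
    """
    def bump(match):
        column, row = match.group(1), int(match.group(2))
        return f"{column}{row + offset}"

    return re.sub(r"([A-Z]+)(\d+)", bump, formula)


def percent_change(old, new):
    """Change from old to new as a fraction of old (assumed definition)."""
    return (new - old) / old


# Copying the row-2 formulas down one row shifts each reference to row 3.
print(shift_rows("=B2+C2+D2+E2", 1))      # =B3+C3+D3+E3
print(shift_rows("=(B2+C2+D2+E2)/4", 1))  # =(B3+C3+D3+E3)/4

# Hypothetical June and July sales; formatting as a percent is a separate step,
# much like clicking the percent button after the value is calculated.
change = percent_change(200, 250)
print(f"{change:.0%}")                    # 25%
```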

Reading: Quick reference: Formulas in spreadsheets

Video: Spreadsheet errors and fixes

This video covers the most common spreadsheet errors and how to fix them.

  • DIV error (#DIV/0!): This error occurs when a formula tries to divide by zero or by an empty cell. To handle it, you can wrap the formula in the IFERROR function so that a custom message, such as “Not applicable”, is shown whenever the formula would result in a DIV error.
  • ERROR (#ERROR! in Google Sheets): This error occurs when the spreadsheet can’t interpret the formula as it was entered, which is also called a parsing error. It can be caused by a missing comma or another typo. To fix this error, carefully check the formula’s punctuation and spelling.
  • N/A error (#N/A): This error occurs when the spreadsheet can’t find the data that the formula refers to. This can happen when the data doesn’t exist or when the value being searched for is misspelled. To fix this error, check both the data and the formula.
  • NAME error (#NAME?): This error occurs when the spreadsheet doesn’t recognize the name of a function or another object in the formula. To fix this error, check the spelling of the name and make sure that the object exists.
  • NUM error (#NUM!): This error occurs when the spreadsheet can’t perform the calculation specified by the formula. This can happen when the data in the formula is inconsistent or wrong. To fix this error, check the data the formula uses.
  • VALUE error (#VALUE!): This error can occur for a variety of reasons, such as when a text value is used in a numeric calculation or when a cell reference is incorrect. To fix this error, check the data and the formula.
  • REF error (#REF!): This error occurs when a formula references a cell that has been deleted. To fix this error, update the formula so that it references a cell or range of cells that still exists.
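
The IFERROR approach to the DIV error has a close analogy in Python's `try`/`except`. The sketch below assumes a made-up function name, `percent_complete`, and made-up task counts. It returns a fallback message when the divisor is zero or missing (`None` stands in for an empty cell).

```python
def percent_complete(tasks_completed, required_tasks, fallback="Not applicable"):
    """Divide, but return a fallback message where a spreadsheet would show a DIV error."""
    try:
        return tasks_completed / required_tasks
    except (ZeroDivisionError, TypeError):
        # ZeroDivisionError: the divisor is 0.
        # TypeError: the divisor is None, used here to mean an empty cell.
        return fallback


print(percent_complete(3, 4))     # 0.75
print(percent_complete(0, 0))     # Not applicable
print(percent_complete(2, None))  # Not applicable
```

As with IFERROR, the fallback hides the error message. It does not fix the underlying data, so it is still worth checking why the divisor is zero or empty.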

The video also provides some tips for troubleshooting spreadsheet errors:

  • Carefully check the formula for any errors.
  • Check the data for any errors.
  • Use the IFERROR function to insert a custom message whenever a formula would result in an error.
  • Use the SUM function and a range of cells instead of adding cell values by direct reference.
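
The last tip, summing over ranges and separating them with a comma, can be illustrated with a small Python sketch. Everything here is hypothetical: `expand_range` and `sum_ranges` are invented helpers, the cell values are made up, and the range parser is deliberately simplified to single-letter columns.

```python
import re


def expand_range(cell_range):
    """Expand a range such as 'B2:C3' into ['B2', 'B3', 'C2', 'C3'].

    Simplification: only single-letter columns (A through Z) are supported.
    """
    match = re.fullmatch(r"([A-Z])(\d+):([A-Z])(\d+)", cell_range)
    if match is None:
        raise ValueError(f"unsupported range: {cell_range!r}")
    col_a, row_a, col_b, row_b = match.groups()
    return [
        f"{chr(col)}{row}"
        for col in range(ord(col_a), ord(col_b) + 1)
        for row in range(int(row_a), int(row_b) + 1)
    ]


def sum_ranges(cells, *ranges):
    """Rough mirror of =SUM(range1, range2, ...); missing cells count as 0."""
    return sum(cells.get(addr, 0) for rng in ranges for addr in expand_range(rng))


# Made-up task counts in columns B and C.
tasks = {"B2": 1, "B3": 2, "B4": 3, "C2": 4, "C3": 5, "C4": 6}

# Each range is its own argument, just as a comma separates ranges inside SUM.
print(sum_ranges(tasks, "B2:B4", "C2:C4"))  # 21
```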

This video is a good resource for anyone who wants to learn how to troubleshoot spreadsheet errors.

Spreadsheet errors can be a major headache for data analysts. They can cause incorrect results, which can lead to bad decisions. There are a number of ways to find and fix spreadsheet errors.

Here are some common spreadsheet errors:

  • Formula errors: These errors occur when there is a mistake in a formula. For example, a typo in a cell reference can cause a formula to return an incorrect result.
  • Data entry errors: These errors occur when incorrect data is entered into a spreadsheet. For example, a number may be entered as text, or a date may be entered in the wrong format.
  • Formatting errors: These errors occur when a spreadsheet is formatted incorrectly. For example, a number may be formatted as text, or a date may be formatted as a time.
  • Logical errors: These errors occur when a formula is logically incorrect. For example, a formula may return a true value when it should return a false value.

Here are some ways to find and fix spreadsheet errors:

  • Use a spreadsheet auditing tool: Spreadsheet auditing tools can help you find and fix errors in your spreadsheets. These tools can identify formulas that are returning incorrect results, data that is entered incorrectly, and formatting errors.
  • Use a spreadsheet validation tool: Spreadsheet validation tools can help you prevent errors from occurring in the first place. These tools can be used to validate data entry, formulas, and formatting.
  • Check your work: The best way to find errors is to simply check your work carefully. This includes looking for typos, verifying data entry, and making sure that formulas are correct.
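
One of the checks described above, spotting numbers that were entered as text, is simple to automate. The Python below is a minimal sketch with invented cell values. The helper name `find_non_numeric` is hypothetical and does not refer to a feature of any spreadsheet application.

```python
def find_non_numeric(cells):
    """Return the addresses of cells whose values are not plain numbers.

    Text such as "250" and empty cells (None) are flagged. Booleans are also
    flagged because Python would otherwise count True and False as numbers.
    """
    return [
        address
        for address, value in cells.items()
        if isinstance(value, bool) or not isinstance(value, (int, float))
    ]


# Made-up row of data: C2 was typed as text and D2 was left empty.
row = {"B2": 100, "C2": "250", "D2": None, "E2": 3.5}
print(find_non_numeric(row))  # ['C2', 'D2']
```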

Here are some additional tips for avoiding spreadsheet errors:

  • Use clear and descriptive names for your cells and formulas.
  • Use consistent formatting throughout your spreadsheet.
  • Use a spreadsheet template whenever possible.
  • Use a spreadsheet auditing tool to regularly check for errors.
  • Back up your spreadsheets regularly.

Hi and welcome back. Recently we’ve been learning about formulas. Sometimes data analysts encounter a problem with our formulas and we get an error. We’ve all been there and it can be frustrating. But there are solutions, and that’s what we’re going to explore in this video.

One error you may encounter is the DIV error. The DIV error happens when a formula is trying to divide a value in a cell by zero or by an empty cell. In this spreadsheet, the Percentage Complete values in column C are calculated by dividing the values in the Tasks Completed column by the values in the Required Tasks column. Notice that column C is already formatted as a percentage. The DIV error is in cell C4 because we’re dividing by zero, the value in cell A4. To avoid this problem, we can have this spreadsheet automatically enter “Not applicable” whenever a cell in column A contains a zero that would cause the error. To do this, we’ll use the IFERROR function. If it encounters a DIV error caused by a cell that contains a zero, the phrase “Not applicable” will be inserted. We can also copy the formula to the rest of the cells in column C so it checks for any other cells that contain a zero.

Now let’s move on to ERROR. In Google Sheets, ERROR tells us the formula can’t be interpreted as it is input. This is also known as a parsing error. Say we want to tally the total number of tasks in columns B and C. We use the SUM function, but the formula =SUM(B2:B6 C2:C6) causes an error. Examining it more closely, we see that a comma is missing between the cell ranges B2 to B6 and C2 to C6. We can fix this by inserting a comma between the cell ranges to indicate the end of each data item. This is called a delimiter, which you will learn more about soon. Now, the formula can correctly calculate the total number of tasks as 25.

Another type of error is N/A. The N/A error tells you that the data in your formula can’t be found by the spreadsheet. Generally, this means the data doesn’t exist. This error most often occurs when using functions such as VLOOKUP, which searches for a certain value in a column to return a corresponding piece of information. Here, we see a master list of nuts and their prices. Using VLOOKUP, the spreadsheet finds prices in the list, then calculates the prices for each store using the assigned markup. But we have an N/A error in cells B49 and C49. The VLOOKUP formula is correct, so what’s going on? Well, if we look carefully at the name of the nut, “almond” has no match in the lookup table; the lookup table uses the plural “almonds” instead. So we change almond to almonds, and with that typo fixed, the right prices are filled in.

Speaking of typos, sometimes a typo can cause a NAME error. A NAME error can happen when a formula’s name isn’t recognized or understood. Suppose we see a NAME error in the nut prices spreadsheet. If we look carefully, the VLOOKUP function in cell B21 is spelled incorrectly: it has one extra O. This causes a NAME error for both the price and the resulting markup calculation for the store. To fix this error, we can delete the extra O in VLOOKUP. Perfect.

Sometimes an error is caused by inconsistent or wrong data. For instance, the NUM error tells us that a formula’s calculation can’t be performed as specified by the data. The data doesn’t make sense for that calculation. Here’s what I mean. Suppose we’re working on a large construction project using a spreadsheet to track how many months it takes to reach key milestones. We can use the DATEDIF function to calculate the number of months between start and end dates. The function requires the start date to be in the first cell referenced and the end date to be in the second cell referenced. In our case, these are cells B2 and C2, respectively. The M represents months, as we want this spreadsheet to calculate the number of months between our start and end dates. But we get a NUM error in cell D6. We notice that the end date comes before the start date, so the DATEDIF function can’t calculate the number of months between them. It’s likely the start and end dates were interchanged by accident. We can request verification of the data to make sure. In the meantime, let’s reverse the order of the cells in the formula to temporarily get around the error. Now, the result is nine months.

What if the client’s name was accidentally inserted into the start date in the spreadsheet? You guessed it, we get an error. The VALUE error can indicate a problem with a formula or referenced cells. It’s often not clear right away what the problem is, so this error might take a little more effort to fix. In this case, John Welty was input as the start date, making the calculation impossible for the DATEDIF function in cell D6. We just replace the text, John Welty, with the correct start date of September 1st, 2016.

Last is the REF error, which often comes up when cells being referenced in a formula have been deleted, thus making the formula unable to perform the calculation. Here’s a spreadsheet used to calculate the number of seats available for a company lunch. Let’s say the company decided not to run the second floor, so we delete row 4. This results in a REF error when calculating the total seats available in cell B5. To fix this, we can change the formula to add the values in cells B2 and B3. Also, in this case, we could have prevented the REF error by using the SUM function and a range of cells instead of adding the cell values by direct reference. Now, if we delete row 10, the SUM function calculates the total seats available.

There you go. We’ve now fixed some of the most common spreadsheet errors. When you see them again, you’ll know what they mean. Troubleshooting is a big part of data analysis, so being able to find solutions is a key skill for data analysts.
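
The IFERROR fix described in this video can be sketched in Python. This is an illustration only: it assumes Required Tasks is the divisor and Tasks Completed the dividend, as the transcript describes, which in a spreadsheet might look like =IFERROR(B4/A4, "Not applicable"). The function name `percent_complete` and the sample rows are hypothetical.

```python
def percent_complete(completed, required, fallback="Not applicable"):
    """Divide completed by required, returning a fallback instead of an error."""
    try:
        return completed / required
    except ZeroDivisionError:
        # Mirrors the division-by-zero case that IFERROR catches.
        return fallback


# Hypothetical (required, completed) pairs; the third row has zero required tasks.
rows = [(10, 5), (4, 4), (0, 0), (8, 2)]
results = [percent_complete(completed, required) for required, completed in rows]
print(results)
```

Unlike this sketch, IFERROR replaces any error value, not only division by zero, so it can also hide unrelated mistakes in a formula.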

Reading: More about spreadsheet errors and fixes

Overview

Practice Quiz: Test your knowledge on using formulas in spreadsheets

Which of the following are examples of operators used in formulas? Select all that apply.

In a spreadsheet, a formula should always start with which of the following operators?

What is the term for the set of cells that a data analyst selects to include in a formula?

In a formula, the plus sign (+) is the operator for addition, and the hyphen (-) is the operator for subtraction.

Functions in spreadsheets


Video: Functions 101

A function is a preset command that automatically performs a specific process or task using the data. Functions can be used to simplify calculations and make spreadsheets more efficient. Here are some examples of functions in spreadsheets:

  • SUM: Adds all the numbers in a range of cells.
  • AVERAGE: Calculates the average of all the numbers in a range of cells.
  • MIN: Returns the smallest number in a range of cells.
  • MAX: Returns the largest number in a range of cells.

To use a function in a spreadsheet, type an equals sign, the function name, and then, in parentheses, the range of cells that you want to apply the function to. For example, to calculate the total sales for the month of June, you would type the following formula into a cell:

=SUM(B2:C2)

If cells B2 and C2 each contain 50, this formula returns 100, because the total sales for June are 50+50=100.
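
For readers who also work in code, the four functions listed above have close Python counterparts. The sketch below assumes the two June values of 50 from the example; the variable names are hypothetical.

```python
from statistics import mean

# Hypothetical June sales, standing in for cells B2:C2.
june_sales = [50, 50]

total = sum(june_sales)      # like =SUM(B2:C2)
average = mean(june_sales)   # like =AVERAGE(B2:C2)
lowest = min(june_sales)     # like =MIN(B2:C2)
highest = max(june_sales)    # like =MAX(B2:C2)
print(total, average, lowest, highest)
```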

Functions can also be used to perform more complex calculations, such as calculating the percent change in sales between two months or finding the lowest monthly sales in a data set.

Functions are a powerful tool that can make spreadsheets more efficient and informative. By learning how to use functions, you can become a more effective data analyst.

Functions 101 in Spreadsheets

Functions are a powerful tool that can make spreadsheets more efficient and informative. They allow you to perform complex calculations and transformations on your data with just a few clicks.

What is a function?

A function is a preset formula that performs a specific task on your data. For example, the SUM function adds all the numbers in a range of cells, while the AVERAGE function calculates the average of all the numbers in a range of cells.

Some common functions

Here are some of the most common functions that you will use in spreadsheets:

  • SUM: Adds all the numbers in a range of cells.
  • AVERAGE: Calculates the average of all the numbers in a range of cells.
  • MIN: Returns the smallest number in a range of cells.
  • MAX: Returns the largest number in a range of cells.
  • COUNT: Counts the number of cells in a range that contain numbers. (Use COUNTA to count all non-empty cells, including text.)
  • IF: Performs a conditional calculation based on a given condition.
  • VLOOKUP: Looks up a value in a table and returns the corresponding value from another column in the table.
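
The last three functions in this list can be approximated in Python to show what each one does. This is a sketch under stated assumptions: the cell values, the sales target, and the price table are all hypothetical, and the dictionary lookup only mirrors an exact-match VLOOKUP.

```python
# Hypothetical column mixing numbers, text, and an empty cell (None).
cells = [120, "pending", 75, None, 30]

# Like COUNT: only numeric entries are counted.
numeric_count = sum(
    1 for c in cells
    if isinstance(c, (int, float)) and not isinstance(c, bool)
)


def sales_label(amount, target=100):
    # Like IF: return one of two results depending on a condition.
    return "Met target" if amount >= target else "Below target"


# Hypothetical lookup table of nut prices.
price_table = {"almonds": 8.0, "cashews": 9.5}


def lookup_price(name):
    # Like an exact-match VLOOKUP: a missing key yields "#N/A".
    return price_table.get(name, "#N/A")


print(numeric_count, sales_label(120), lookup_price("almonds"), lookup_price("almond"))
```

The final lookup fails for the same reason as in a spreadsheet: the singular "almond" does not match the key "almonds".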

Nesting functions

You can also nest functions, which means that you can use one function as the argument to another function. For example, the following formula calculates the average of the June sales in cells B2 and C2 and rounds it to a whole number:

=ROUND(AVERAGE(B2:C2), 0)

In this formula, the AVERAGE function is nested inside the ROUND function. AVERAGE is evaluated first, and its result becomes the first argument of ROUND. This allows you to perform complex calculations in a single formula.
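
Nesting in the strict sense means that one call supplies the argument of another. A minimal Python sketch of that idea follows; the sales values are hypothetical, and the spreadsheet formula in the comment is an assumed equivalent rather than one taken from the course.

```python
from statistics import mean

# Hypothetical sales values, standing in for cells B2:D2.
sales = [50, 45, 48]

# The inner call runs first; its result becomes the outer call's argument,
# much like a formula such as =ROUND(AVERAGE(B2:D2), 0).
rounded_average = round(mean(sales))
print(rounded_average)
```

Reading a nested expression from the inside out is usually the easiest way to follow the order of evaluation.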

Tips for using functions

Here are some tips for using functions in spreadsheets:

  • Use the built-in function helper to insert and edit functions. In Google Sheets, choose Insert > Function; in Excel, click the Insert Function (fx) button next to the formula bar.
  • Be careful when nesting functions. The innermost function is evaluated first, so make sure that you understand the order in which the functions will be evaluated.
  • Use named ranges and clear cell references. Function names are preset, but descriptive range names make your formulas easier to read and maintain.

Conclusion

Functions are a powerful tool that can make spreadsheets more efficient and informative. By learning how to use functions, you can become a more effective data analyst.

Here are some additional tips for learning and using functions in spreadsheets:

  • Start by learning the most common functions, such as SUM, AVERAGE, MIN, MAX, COUNT, IF, and VLOOKUP.
  • Practice using functions on your own data. This is the best way to learn how they work and how to use them effectively.
  • Look for tutorials and online resources that can help you learn more about functions.
  • Don’t be afraid to experiment. The best way to learn is by trying different things.

With a little practice, you will be using functions like a pro in no time!

Formulas are a great way to become more efficient when using spreadsheets, especially when you add shortcuts, like copying and pasting, into the mix. As you progress as a data analyst, you’ll most likely learn more shortcuts to help your process. But now it’s time to move on to functions. While they’re closely related to formulas, they’re not exactly the same. By the end of this video, you’ll understand the difference and know when to use them both.

In the world of spreadsheets, a function is a preset command that automatically performs a specific process or task using the data. You might remember some of the shortcuts we learned that can be used with formulas. Think of functions as the most useful of the shortcuts. The good news is a lot of spreadsheet functions have names that tell you what they do. There are tons of functions out there. As you continue to work with spreadsheets, you’ll find that you use certain ones a lot, and others rarely or not at all.

For now, let’s take a look at some of the functions that we can apply to our sales data from the previous video. We’ll start with total sales. Let’s use the SUM function for this in cell F2. The first steps are pretty similar to what we did in the last video. First, we’ll select the cell where we want the calculation to appear. Type equals, then add the word SUM as our function. One of the great things about functions is they don’t always need operators, like a plus sign for addition. In this case, after the open parenthesis, you can go ahead and select the range of cells you’re adding. A colon between the cell references shows that you’re using a range. In this case, the range includes cells from the same row. After the closing parenthesis, we press Enter. Just like that, our total sales number appears.

Just like the formula we used before, functions can be copied and pasted into other cells in the same column. But let’s undo that step so that you can see another way to copy a function or formula. Spreadsheets have something called a fill handle. It’s a little box that appears in the lower right-hand corner when you click on a cell. If you rest your cursor on the box, you can then drag the fill handle to the other boxes in the same row or column. Any formula or function in that cell will automatically be added to the cells you fill. Plus, the fill handle will update the formula so the cell references match the row or column of the cells you fill. This means the formula is calculated based on the data in each separate row or column. Filling won’t work for every situation, but it’s still a pretty great trick.

Now let’s find the average sale for each month using the AVERAGE function. Different functions perform different calculations, but they work in the same way. Keep in mind, not every calculation you’ll come across has its own function to help you. For example, to find the percent change in sales between June and July, you’d use the same formula you used in an earlier video.

Let’s say you’re asked to find the lowest monthly sales in this data set. There’s a function for that. It’s called the MIN function, which stands for minimum. Here’s how it works. Say you need to find the lowest monthly sales for the whole set. All you have to do is set up the function. Then after the open parenthesis, select the values from all three rows. This might be important information for your stakeholders. Let’s add color to the cell with that value in your data set to make it stand out. In this case, click on cell D2 and then the fill color icon, which looks like a paint can, then choose a color. I’ll use yellow here.

You can follow the same steps for the highest sales by using the, wait for it, MAX function. Looks like we have an error message. What could be wrong? We forgot to include an open parenthesis after the function. No worries, it’s a quick fix. But this is a good reminder to continually check the format of your functions and formulas as you use them. We’ll learn more about error messages and how to work with them later. That’s better. Now we’ll add color to the cell with the highest sales too. This is just one way to highlight key data. You’ll find out about some others later.

You’ve now had a peek at some ways you can add and organize data in a spreadsheet. You’ve also seen how powerful formulas and functions can be when applied to real-world data. As a data analyst, this is just the beginning of your experience with spreadsheets. You’ll soon find out how much more spreadsheets have to offer. In the meantime, you’re free to practice some of these formulas, functions, and other processes on your own. It can be fun to experiment and see all that spreadsheets can do. Soon, you will switch from spreadsheets to structured thinking. The data analytics pieces are starting to fit together. Exciting stuff is coming right up. So stick around.
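
The way the fill handle updates cell references can be imitated with a short Python sketch. This is an illustration of the idea, not how any spreadsheet implements it: the helper `fill_down` is hypothetical, the range B2:E2 is an assumed layout for the sales data, and only simple relative references are handled.

```python
import re


def fill_down(formula, rows_down):
    """Shift every simple relative row number in a formula by rows_down."""
    def shift(match):
        column, row = match.group(1), int(match.group(2))
        return f"{column}{row + rows_down}"

    # Matches references such as B2 or AA10. Absolute ($) references, sheet
    # names, and function names ending in digits are not handled here.
    return re.sub(r"\b([A-Z]{1,3})(\d+)\b", shift, formula)


# Filling a total-sales formula down two more rows.
filled = [fill_down("=SUM(B2:E2)", n) for n in range(3)]
print(filled)
```

Each filled formula points at its own row, which is why the total is calculated separately for every row you fill.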

Which of the following are functions? Select all that apply.

MIN, AVERAGE, SUM

SUM, AVERAGE, and MIN are functions. A function is a preset command that automatically performs a specific process or task using data.

Reading: Quick reference: Functions in spreadsheets

Overview

Practice Quiz: Hands-On Activity: Create a Custom Data Table

Practice Quiz: Test your knowledge on using functions in spreadsheets

Data analysts use which of the following functions to quickly perform calculations in a spreadsheet? Select all that apply.

What is the term for a preset command in a spreadsheet?

You are working with spreadsheet data about a cross-country relay race. Each runner’s times are located in cells H2 through H28. To find the runner with the slowest time, what is the correct function?

Save time with structured thinking


Video: Before solving a problem, understand it

  • Albert Einstein once said that if he had one hour to save the planet, he would spend 59 minutes defining the problem and one minute resolving it. This shows the importance of defining the problem before trying to solve it.
  • A lot of times, teams jump right into data analysis without clearly defining the problem. This can lead to them solving the wrong problem or not having the right data.
  • In this video, we will learn how to develop a structured approach to defining the problem domain. This is the specific area of analysis that encompasses every activity affecting or affected by the problem.
  • Before we can do anything else, we need to understand the problem domain and all of its parts and relationships. Working without that understanding is like putting together a jigsaw puzzle without knowing what the picture is supposed to be.
  • Data analysts face the same challenges. They are not always given the complete picture at the start of a project. A big part of their job is to develop a structured approach and use critical thinking to find the best solution.
  • This starts with understanding the problem domain. We need to train our brains to think structurally in order to successfully solve problems as data analysts.

Here are some key points from the text:

  • Defining the problem is an important first step in data analysis.
  • A structured approach to defining the problem domain can help us to understand the problem better and to find the best solution.
  • Data analysts need to be able to think structurally in order to solve problems.

Introduction

Data analysis is the process of collecting, cleaning, and analyzing data to extract insights. It is a powerful tool that can be used to solve a variety of problems. However, before you can solve a problem with data analysis, you need to understand the problem.

Why is understanding the problem important?

There are a few reasons why understanding the problem is important in data analysis. First, it helps you to identify the right data to collect. If you don’t understand the problem, you may collect the wrong data, which can lead to inaccurate results. Second, understanding the problem helps you to choose the right analytical methods. There are many different analytical methods available, and each one is better suited for solving certain types of problems. Third, understanding the problem helps you to interpret the results of your analysis. If you don’t understand the problem, you may misinterpret the results, which can lead to incorrect conclusions.

How to understand a problem

There are a few steps you can take to understand a problem:

  1. Define the problem. What is the specific problem you are trying to solve? What are the symptoms of the problem? What are the consequences of the problem?
  2. Gather information. What data is available that can help you to understand the problem? Where can you find this data?
  3. Analyze the data. This involves cleaning the data, exploring the data, and identifying patterns in the data.
  4. Develop a hypothesis. Based on your understanding of the data, what is the likely cause of the problem?
  5. Test your hypothesis. This involves collecting more data and conducting further analysis.
  6. Interpret the results. What do the results of your analysis tell you about the problem?

Conclusion

Understanding the problem is an essential first step in data analysis. By taking the time to understand the problem, you can increase the chances of success in your data analysis project.

Here are some additional tips for understanding a problem in data analysis:

  • Talk to the people who are affected by the problem. They can provide valuable insights into the problem and its causes.
  • Brainstorm with a team of people. This can help you to come up with different perspectives on the problem.
  • Use visualization tools. This can help you to see the data in a new way and to identify patterns that you may not have noticed otherwise.
  • Be patient. It takes time to understand a complex problem. Don’t rush the process.

Albert Einstein once said, “If I were given one hour to save the planet, I would spend 59 minutes defining the problem and one minute resolving it.” Now, that might seem extreme, but it does show us just how important it is to define the problems before trying to solve them. A lot of times, teams jump right into data analysis before realizing a few months later that they are either solving the wrong problem or they don’t have the right data.

In this video, we will learn how to develop a structured approach to defining the problem domain. This is important because if you define the problem clearly from the start, it’ll be easier to solve, which saves a lot of time, money, and resources. In the data world, we call this first piece the problem domain: the specific area of analysis that encompasses every activity affecting or affected by the problem. Before we can do anything else, we need to understand the problem domain and all of its parts and relationships so that we can discover the whole story.

Actually, calling it the first piece makes me think of a jigsaw puzzle. Say you have a puzzle. Let’s think of that puzzle as our problem domain. You have all 500 pieces but you lost the box. So you don’t know what image the puzzle will reveal. Will it be an animal? A waterfall? A bowl of oranges? Whatever it is, it’s going to be tough trying to put it together without an image you can refer to. Even the greatest puzzler in the galaxy would need a new process and lots of time to complete that puzzle.

Data analysts face the same kinds of challenges too. You might remember that data analysts aren’t always given the complete picture at the start of a project. A big part of their job is to develop a structured approach and use critical thinking to find the best solution. That starts with understanding the problem domain. This is where structured thinking comes into play. To successfully solve a problem as a data analyst, you need to train your brain to think structurally. That’s exactly what you’ll learn coming up. See you there.

Video: Scope of work and structured thinking

Structured thinking is the process of recognizing the current problem or situation, organizing available information, revealing gaps and opportunities, and identifying the options. It is a way to be prepared and to have a clear plan for completing a project or solving a problem.

Structured thinking helps data analysts save time and effort by preventing them from having to redo their work. It also makes their job easier by allowing them to better understand the work they are doing.

One of the starting places for structured thinking is the problem domain, which is the specific area of analysis that encompasses every activity affecting or affected by the problem. Once you know the problem domain, you can set your base and lay out all your requirements and hypotheses before you start investigating.

Another way to practice structured thinking is to use a scope of work (SOW). An SOW is an agreed-upon outline of the work you are going to perform on a project. It should include things like work details, schedules, and reports that the client can expect.

A scope of work can be a simple but powerful tool. With a solid scope of work, you will be able to address any confusion, contradictions, or questions about the data up-front and make sure these setbacks don’t stand in your way.

In the next video, you will learn about the importance of contextualizing data and avoiding bias.

Introduction

A scope of work (SOW) is an agreed-upon outline of the work that will be performed on a project. It is a valuable tool for data analysts because it can help to avoid confusion, contradictions, and questions about the data up-front.

Structured thinking is a process of recognizing the current problem or situation, organizing available information, revealing gaps and opportunities, and identifying the options. It is a valuable tool for data analysts because it can help us to better understand the work we are doing and to avoid mistakes.

How to create a scope of work

To create a scope of work, you will need to:

  1. Define the problem domain. This is the specific area of analysis that encompasses every activity affecting or affected by the problem.
  2. Identify the deliverables. What are the specific products or services that you will be providing?
  3. Set the timeline. When will the work be completed?
  4. Define the milestones. What are the key checkpoints along the way?
  5. Identify the resources. What people, equipment, and materials will be needed?
  6. Establish the budget. How much will the project cost?
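
A scope of work is a document rather than code, but writing its parts down as a small data structure can make the checklist concrete. The Python sketch below is one hypothetical way to do that; the class name, field names, and dates are assumptions, and the wedding invitation entries follow the example used in the video.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ScopeOfWork:
    problem_domain: str
    deliverables: list[str]
    due_date: date
    milestones: dict[str, date] = field(default_factory=dict)

    def overdue_milestones(self, today):
        """Return milestone names whose date has passed, earliest first."""
        late = [(when, name) for name, when in self.milestones.items() if when < today]
        return [name for when, name in sorted(late)]


# Hypothetical scope for one task: the wedding invitations.
sow = ScopeOfWork(
    problem_domain="Wedding invitations",
    deliverables=["Choose the invitation", "Compile the guest list", "Mail the invitations"],
    due_date=date(2025, 3, 1),
    milestones={
        "Guest list finalized": date(2025, 1, 15),
        "Invitations printed": date(2025, 2, 1),
    },
)
print(sow.overdue_milestones(date(2025, 1, 20)))
```

Checking milestones against a date is a simple form of the progress reports a scope of work promises.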

How to use structured thinking

To use structured thinking, you will need to:

  1. Define the problem. What is the specific problem that you are trying to solve?
  2. Gather information. What data is available that can help you to understand the problem?
  3. Analyze the data. This involves cleaning the data, exploring the data, and identifying patterns in the data.
  4. Develop a hypothesis. Based on your understanding of the data, what is the likely cause of the problem?
  5. Test your hypothesis. This involves collecting more data and conducting further analysis.
  6. Interpret the results. What do the results of your analysis tell you about the problem?

Conclusion

A scope of work and structured thinking are both valuable tools for data analysts. By using these tools, you can avoid setbacks and ensure that your data analysis projects are successful.

Here are some additional tips for creating a scope of work and using structured thinking:

  • Be clear and concise. The scope of work should be easy to understand by everyone involved in the project.
  • Be realistic. The timeline and budget should be achievable.
  • Be flexible. Things change, so be prepared to adjust the scope of work as needed.
  • Communicate regularly. Keep everyone involved in the project updated on your progress.

What process do data analysts use to recognize the current situation, organize information, and identify options?

Structured thinking

Data analysts use structured thinking to recognize the current situation, organize information, reveal gaps and opportunities, and identify options.

Earlier I told you that carefully defining a business problem can ultimately save time, money, and resources. All of this is achieved through structured thinking. Structured thinking is the process of recognizing the current problem or situation, organizing available information, revealing gaps and opportunities, and identifying the options. In other words, it’s a way of being super prepared. It’s having a clear list of what you are expected to deliver, a timeline for major tasks and activities, and checkpoints so the team knows you’re making progress.

In this video, we’ll look at how structured thinking not only helps us save time and effort, but also makes our job as data analysts easier because it allows us to better understand the work we are doing. In the business world, it’s common for teams to spend hours of valuable time trying to solve an important problem, only to end up back where they started. Not only is the initial problem not resolved, but they’ve spent hours not resolving it. This outcome negatively affects you, your team, and the organization as a whole. But it can usually be prevented. Many times the situation is a result of not fully understanding the issue. Structured thinking will help you understand problems at a high level so that you can identify areas that need deeper investigation and understanding.

The starting place for structured thinking is the problem domain, which you might remember from earlier. Once you know the specific area of analysis, you can set your base and lay out all your requirements and hypotheses before you start investigating. With a solid base in place, you’ll be ready to deal with any obstacles that come up. What kind of obstacles? Well, let’s say you’re asked to predict the future value of an apartment building based on a given dataset. You have hundreds of variables and every one is crucial to your analysis. But what if one variable accidentally gets left out, like square footage, for example? You’d have to go back and redo all your hard work. That’s because missing variables can lead to inaccurate conclusions.

Another way that you can practice structured thinking and avoid mistakes is by using a scope of work. A scope of work or SOW is an agreed-upon outline of the work you’re going to perform on a project. For many businesses, this includes things like work details, schedules, and reports that the client can expect. Now, as a data analyst, your scope of work will be a bit more technical and include those basic items we just mentioned, but you’ll also focus on things like data preparation, validation, analysis of quantitative and qualitative datasets, initial results, and maybe even some visuals to really get the point across.

Let’s bring a scope of work to life with a simple example. Say a couple has hired a wedding planner. We’ll focus on just one task, the wedding invitations. Here’s what might be in a scope of work: deliverables, timeline, milestones, and reports. Let’s break down just one of these, deliverables. The wedding planner and couple will need to decide on the invitation, make a list of people to invite, collect their addresses, print the invitations, address the envelopes, stamp them, and mail them out. Now let’s check out the timelines. You’ll notice the dates and the milestones which keep us on track. Finally, we have the reports, which give our couple some peace of mind by telling them when each step is complete.

A scope of work can be a simple but powerful tool. With a solid scope of work, you’ll be able to address any confusion, contradictions, or questions about the data up-front and make sure these sneaky setbacks don’t stand in your way. This is a simple example of what a scope of work might look like. But later, you’ll be able to practice building your own. Next up in our scope, we’ll check out setbacks from a different angle by learning the importance of contextualizing data and avoiding bias. Looking forward to sharing some cool insights with you.

Practice Quiz: Hands-On Activity: Create a scope of work


Video: Staying objective

Contextualizing data is important because it allows us to understand the meaning of the data. For example, knowing the date and time that the data was collected can help us to understand why certain trends are occurring. Additionally, knowing who collected the data and how it was collected can help us to identify any potential biases in the data.

Bias in data can occur when the data is collected in a way that is not representative of the population as a whole. For example, if a survey is only given to people who are already interested in a particular topic, the results of the survey will be biased towards that topic.

To avoid bias in data, it is important to start with an accurate representation of the population and to collect the data in the most appropriate and objective way possible.
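
The survey example above can be made concrete with a few lines of Python. The numbers are invented purely for illustration: each hypothetical respondent has an interest level and a satisfaction score, and the sketch compares the average across everyone with the average among only the most interested people.

```python
from statistics import mean

# Hypothetical population: (interest in the topic, satisfaction score), both 1-10.
population = [(2, 3), (3, 4), (4, 4), (5, 5), (6, 6), (8, 8), (9, 9), (10, 9)]

# Representative view: every member of the population is included.
overall = mean([score for _, score in population])

# Biased view: only people already interested in the topic are surveyed.
interested_only = mean([score for interest, score in population if interest >= 8])

print(overall, round(interested_only, 2))
```

In this made-up data the self-selected group reports a noticeably higher average than the population as a whole, which is the kind of distortion a representative sample is meant to avoid.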

Here are some tips for contextualizing data:

  • Consider the who, what, where, when, how, and why of the data.
  • Who collected the data?
  • What is it about?
  • What does the data represent in the world, and how does it relate to other data?
  • When was the data collected?
  • Where was the data collected?
  • How was the data collected?
  • Why was the data collected?

By asking yourself these questions, you can better understand the meaning of the data and identify any potential biases.
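One way to make these questions a habit is to record the answers next to the data set. The sketch below is a minimal, hypothetical example: the `DataContext` class, the `missing_context` helper, and the sample survey values are assumptions made for illustration, not part of the course material.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DataContext:
    """Answers to the six context questions for one data set."""
    who: Optional[str] = None    # Who collected the data?
    what: Optional[str] = None   # What is it about?
    when: Optional[str] = None   # When was it collected?
    where: Optional[str] = None  # Where was it collected?
    how: Optional[str] = None    # How was it collected?
    why: Optional[str] = None    # Why was it collected?


def missing_context(context: DataContext) -> List[str]:
    """Return the names of the questions that are still unanswered."""
    return [f.name for f in fields(context) if not getattr(context, f.name)]


# Hypothetical data set description with two questions left open.
survey = DataContext(
    who="Marketing team",
    what="Customer satisfaction scores",
    when="March 2023",
    how="Online survey",
)

gaps = missing_context(survey)
print("Still need to find out:", gaps)
```

Here the helper reports that the where and the why are still unknown, which is a cue to find out more before drawing conclusions from the data.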

What is objectivity in data analytics?

Objectivity in data analytics is the ability to interpret data without letting personal biases or prejudices influence the results. This means that data analysts should be aware of their own biases and take steps to mitigate them.

Why is objectivity important in data analytics?

Objectivity is important in data analytics because it helps to ensure that the results are accurate and reliable. When data analysts are not objective, they may unknowingly introduce bias into the results, which can lead to incorrect conclusions.

How to stay objective in data analytics

There are a number of things that data analysts can do to stay objective in their work, including:

  • Be aware of your own biases. The first step to staying objective is to be aware of your own biases. This means being aware of your own personal beliefs, values, and experiences, and how they might influence your interpretation of data.
  • Consider multiple perspectives. When analyzing data, it is important to consider multiple perspectives. This means looking at the data from different angles and considering different interpretations.
  • Use statistical methods to minimize bias. There are a number of statistical methods that can be used to minimize bias in data analysis. These methods can help to ensure that the results are not influenced by the data analyst’s own biases.
  • Get feedback from others. It is helpful to get feedback from others on your data analysis. This can help you to identify any potential biases in your work.

Conclusion

Objectivity is an important quality for data analysts to have. By following the tips above, data analysts can help to ensure that their work is accurate and reliable.

Here are some additional tips for staying objective in data analytics:

  • Use a variety of data sources. This will help to reduce the risk of bias from any single source.
  • Be transparent about your methods. This will allow others to see how you analyzed the data and to identify any potential biases.
  • Be open to feedback. Be willing to listen to feedback from others and to make changes to your analysis if necessary.

A data analyst considers who, what, when, where, why, and how in order to achieve what goal?

To put information into context

A data analyst asks who, what, when, where, why, and how in order to put information into context.

Welcome back. In this video, we’ll explore the importance
of contextualizing data, and recognizing data
bias. Let’s get started. Data doesn’t live in a
vacuum, it needs context. Earlier, we learnt
that context is the condition in which
something exists or happens. Actions can be appropriate
in some contexts, but inappropriate in others. For example, yelling “move” is rude in one context, if your friend is standing
in front of the TV, but it’s entirely
appropriate in another, if that friend is about to get hit by a kid on a tricycle. Do you see the difference? In the world of data, numbers don’t mean
much without context. I’ll let my fellow Googler, Ed, tell you a little
bit more about that. As we have more and more
data available to us, we can leverage that data in increasingly
sophisticated ways, and generate more powerful
insights from it. We use data at many
different levels. Sometimes our data
is descriptive, answering questions like, how much did we spend on
travel last month? Data becomes more valuable as we generate diagnostic
and predictive insights, like understanding why travel
spend increased last month. Data is most valuable, however, when we can generate
prescriptive insights. For example, how can we leverage data to incentivize
more efficient travel? Figuring out what data means is just as important
as collecting it. As a data analyst, a big part of your job is putting data into context. It’s also up to you to remain objective and recognize all sides of an argument
before drawing conclusions. The thing about context is that it’s very personal. If two people curate
the same data set, and follow the same directions, there’s a chance they will end
up with different results. Why? Because there is no universal set of
contextual interpretations. Everyone approaches
it in their own way. Even if the data collection
process is correct, the analysis can still
be misinterpreted. Conclusions can be influenced by your own conscious and
subconscious biases, which are based on cultural, social and market norms. For example, if you
ask a Boston resident, which baseball team is the best, chances are, they’re going
to say the Boston Red Sox. Which brings us to a major
limitation of data analytics. If the analysis is not objective, the conclusions
can be misleading. To really understand
what the data is about, you have to think
through who, what, where, when, how and why. It’s good to ask
yourself questions like, who collected the data? And what is it about? What does the data
represent in the world, and how does it
relate to other data? When was the data collected? Data collected a while ago may have certain limitations, given the present-day situation. For example, if we collected phone numbers over the past
century, at some point, mobile phones would
have been introduced, leading to the need for an
additional phone number field. You should also think about where the data was collected. A lot can change across cities, states, and countries. And
how was it collected? A survey might not
be as effective as an in-person
interview, for example. Of course, there’s the why. The why can have a particularly strong
relationship with bias. Why? Because sometimes,
data is collected, or even made up, to
serve an agenda. The best thing you can do for the fairness and
accuracy of your data, is to make sure you start with an accurate representation
of the population, and collect the data in the most appropriate and objective way. Then you’ll have the facts you can pass on to your team. Hopefully you now understand the importance of fair
and objective data, and how important context is when it comes to understanding
and interpreting it. Next up, we’ll figure out
how we can bring it to life.

Reading: The importance of context

Overview

Reading: Learning Log: Define problems and ask questions with data

Reading

Practice Quiz: Test your knowledge on structured thinking

What are the key elements of structured thinking? Select all that apply.

Fill in the blank: A scope of work is an agreed-upon _____ of the work you’re going to perform on a project.

What are some strategies to ensure your data is accurate and fair? Select all that apply.

Weekly challenge 3


Reading: Glossary: Terms and definitions

Quiz: *Weekly challenge 3*

Which of the following are examples of expressions? Select all that apply.

Which of the following are good practices when working with data in a spreadsheet? Select all that apply.

A data analyst could use spreadsheets to achieve which of the following tasks?

Which of the following statements accurately describe formulas and functions? Select all that apply.

In the function =MAX(B5:B15), what does B5:B15 represent?

What is the correct spreadsheet formula for multiplying cell K3 times cell K8?

Fill in the blank: By negatively influencing data collection, ____ can have a detrimental effect on analysis.

In data analytics, the structured thinking process includes recognizing the current problem and organizing the available information. What are the additional aspects of this process? Select all that apply.