As you’re learning, spreadsheets, query languages, and data visualization tools are all a big part of a data analyst’s job. In this part of the course, you’ll learn more about the basic concepts involved and explore some examples of how these tools work.
Learning Objectives
- Describe spreadsheets, query languages, and data visualization tools, giving specific examples
- Demonstrate an understanding of the uses, basic features, and functions of a spreadsheet
- Explain the basic concepts involved in the use of SQL including specific examples of queries
- Identify the basic concepts involved in data visualization, giving specific examples
Mastering spreadsheet basics
Video: The ins and outs of core data tools
The speaker introduces the next few videos in the course, which will focus on the data analytics tools of spreadsheets, SQL, and data visualization.
- Spreadsheets: The speaker will break down spreadsheets to their basics and show how to use them to sort data.
- SQL: The speaker will show how to use SQL to retrieve large amounts of data quickly.
- Data visualization: The speaker will discuss the different types of data visualizations and show examples of how they can be used.
The speaker also uses a food analogy to describe the different stages of data analytics. Spreadsheets are like the appetizer, SQL is like the main course, and data visualization is like the dessert.
The speaker concludes by asking if anyone is hungry, which is a humorous way of saying that the next few videos will be informative and enjoyable.
What are core data tools?
Core data tools are the essential tools that every data analyst needs to know how to use. They include:
- Spreadsheets: Spreadsheets are a versatile tool that can be used for data cleaning, analysis, and visualization.
- SQL: SQL is a language for querying and managing databases. It is essential for data analysts who need to access and manipulate large amounts of data.
- Data visualization tools: Data visualization tools help to communicate data insights in a clear and concise way. They are essential for presenting data to stakeholders and making data-driven decisions.
How to use core data tools
How you use core data tools depends on your needs and the data you are working with. However, some general principles apply to all of them.
- Clean your data. Before you can analyze your data, you need to make sure it is clean and free of errors. This may involve removing duplicate data, correcting typos, and filling in missing values.
- Organize your data. Once your data is clean, you need to organize it in a way that makes it easy to access and analyze. This may involve creating spreadsheets, databases, or data warehouses.
- Query your data. Once your data is organized, you can start to query it to extract the insights you need. This can be done using SQL or other query languages.
- Visualize your data. Data visualization is a powerful way to communicate data insights. There are many different data visualization tools available, so you can choose the one that best suits your needs.
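The cleaning step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the `CITY_FIXES` correction table, and the choice to fill a missing order count with 0 are all hypothetical, and real cleaning rules depend on the dataset.

```python
# Hypothetical raw records: one exact duplicate, one typo, one missing value.
raw_rows = [
    {"name": "Ana", "city": "Lima", "orders": 3},
    {"name": "Ana", "city": "Lima", "orders": 3},     # duplicate row
    {"name": "Ben", "city": "Lmia", "orders": None},  # typo and missing value
]

# Assumed lookup of known typos and their corrections.
CITY_FIXES = {"Lmia": "Lima"}


def clean(rows, default_orders=0):
    """Correct known typos, fill missing values, and drop exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        fixed = dict(row)
        fixed["city"] = CITY_FIXES.get(fixed["city"], fixed["city"])
        if fixed["orders"] is None:
            fixed["orders"] = default_orders  # assumption: missing means 0
        key = tuple(sorted(fixed.items()))
        if key in seen:
            continue  # an identical row was already kept
        seen.add(key)
        cleaned.append(fixed)
    return cleaned


clean_rows = clean(raw_rows)
print(clean_rows)
```

Whether a missing value should become 0, be left empty, or cause the row to be dropped is a judgment call that depends on what the data will be used for.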
Core data tools for beginners
If you are new to data analytics, there are a few core data tools that I recommend you start with:
- Microsoft Excel: Excel is a powerful spreadsheet program that is easy to learn and use. It is a good choice for beginners who need to perform basic data cleaning and analysis.
- SQL: SQL is a foundational language for data analytics. It is a good idea to learn SQL even if you are not planning to use it as your primary data analysis tool.
- Tableau: Tableau is a popular data visualization tool that is easy to use and produces high-quality visualizations. It is a good choice for beginners who want to create data visualizations that are easy to understand and share.
Advanced core data tools
Once you have mastered the basics of core data tools, you can start to explore more advanced tools. Some of the most popular advanced core data tools include:
- Python: Python is a versatile programming language that is often used for data analysis. It is a good choice for learners who want to move on to a more powerful data analysis tool.
- R: R is another popular programming language for data analysis. It is known for its statistical capabilities.
- Hadoop: Hadoop is a distributed computing framework that can be used to process large datasets. It is a good choice for businesses that need to analyze large amounts of data.
Conclusion
Core data tools are essential for data analysts. By learning how to use these tools, you can gain the skills you need to analyze data, communicate insights, and make data-driven decisions. I hope this tutorial has been helpful. If you have any questions, please feel free to ask.
Welcome back. In the next few videos, you’ll continue to explore the data analytics tools we discussed earlier, and you’ll get the chance to see them in action a little bit. This will give you a clearer picture of how to use these tools. The rest of the program will build on what you learn here.
We’ll start with a closer look at spreadsheets. We’ll break spreadsheets down to their basics to better understand a few of their features and functions. You’ll also learn how you might want to use them in your work as a data analyst. For example, how do you sort data to make it easier to use? We’ll find out.
Next, we’ll see SQL in action. Data analysts use SQL in their work all the time, like when they need a large amount of data in seconds to help answer a quick business question. Chances are, you’re not familiar with SQL. That’s okay. You’ll learn how using SQL is just like ordering food at a super speedy restaurant. Your SQL query might not be as delicious, but you won’t have to wait long to get your order.
Speaking of food, what better topic than dessert? You can think of data visualization as the dessert to the meal of data analytics. It’s served at the end of your analysis after you’ve done what you need to get the right data for a question or task. We’ve already seen that visualizations come in a lot of forms, like graphs or charts. Just like dessert, they’re a treat to look at. You’ll learn more about these visual representations and see other examples of how they might look. Then you’ll get to talk about visualizations with other future data analysts just like yourself.
We’ll wrap things up with an assessment, but you’ll have time to review what you’ve learned before then. Okay, let’s keep going. By the way, is anyone else hungry now?
Video: Columns and rows and cells, oh my!
The video introduces the basics of spreadsheets, including cells, rows, columns, attributes, observations, and sorting. It also shows how to use formulas to manipulate data.
Here is a summary of the key points:
- Cells: Cells are the basic building blocks of a spreadsheet. Each cell is identified by a unique address, which is a combination of a column letter and a row number.
- Rows: Rows are organized horizontally in a spreadsheet and are numbered.
- Columns: Columns are organized vertically in a spreadsheet and are labeled with letters.
- Attributes: Attributes are the characteristics of the data in a spreadsheet. They are typically listed in the header row.
- Observations: Observations are the rows of data in a spreadsheet. Each observation contains all of the attributes for a single entity.
- Sorting: Sorting is the process of organizing data in a specific order. You can sort data in ascending or descending order.
- Formulas: Formulas are used to manipulate data in a spreadsheet. They are written in cells and use cell references to indicate the data that they are calculating.
The video also shows how to use a formula to calculate the total number of siblings in a dataset.
Overall, the video provides a good introduction to the basics of spreadsheets. It is a good starting point for anyone who wants to learn how to use spreadsheets for data analysis.
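To make the cell-address idea concrete, here is a small Python sketch that splits an address such as D3 into its column letters and row number. The function names are made up for illustration; the only convention relied on is that spreadsheet columns continue past Z as AA, AB, and so on.

```python
import re


def parse_cell_address(address):
    """Split an address such as 'D3' into (column_letters, row_number)."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", address)
    if match is None:
        raise ValueError(f"not a cell address: {address!r}")
    letters, digits = match.groups()
    return letters.upper(), int(digits)


def column_index(letters):
    """Convert column letters to a 1-based index: A -> 1, Z -> 26, AA -> 27."""
    index = 0
    for letter in letters.upper():
        index = index * 26 + (ord(letter) - ord("A") + 1)
    return index


column, row = parse_cell_address("D3")
print(column, row, column_index(column))  # D 3 4
```

So cell D3 sits in the fourth column and the third row.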
What are columns and rows?
In data analytics, columns and rows are the two main components of a data table. Columns represent the different attributes of a data set, while rows represent the individual observations or records in the data set.
For example, a data table of customer orders might have columns for the customer’s name, address, order date, and order items. Each row in the table would represent a single order.
What are cells?
Cells are the intersections of columns and rows. They contain the actual data values for each attribute and observation.
In the example above, the cell at the intersection of the “Customer Name” column and the first row would contain the name of the first customer in the data set.
Why are columns and rows important?
Columns and rows are important for organizing and structuring data. They make it easy to identify and access specific data values, and they can help to improve the readability and understandability of data tables.
For example, if you are looking for the customer’s address in the example above, you can easily find it by looking at the “Address” column.
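The customer orders example can be sketched in Python as a header row plus a list of rows. The names and values below are invented for illustration, and the helper functions `cell` and `column_values` are hypothetical, not part of any spreadsheet tool.

```python
# Header row: the attributes (column names). All values below are made up.
columns = ["Customer Name", "Address", "Order Date", "Order Items"]

# Each inner list is one row: a single order, also called an observation.
rows = [
    ["Ana Ruiz", "12 Oak St", "2023-01-05", "notebook"],
    ["Ben Cho", "8 Elm Ave", "2023-01-06", "pens"],
]


def cell(rows, columns, row_number, column_name):
    """Return the value where a 1-based data row meets a named column."""
    return rows[row_number - 1][columns.index(column_name)]


def column_values(rows, columns, column_name):
    """Return every value in one column, top to bottom."""
    position = columns.index(column_name)
    return [row[position] for row in rows]


print(cell(rows, columns, 1, "Customer Name"))  # Ana Ruiz
print(column_values(rows, columns, "Address"))  # ['12 Oak St', '8 Elm Ave']
```

Looking up a value by column name and row number is the same move as finding the intersection of a column and a row in a spreadsheet.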
How to use columns and rows in data analytics
Columns and rows are used in many different data analytics tasks, such as:
- Data cleaning: Columns and rows can be used to identify and remove duplicate data, correct typos, and fill in missing values.
- Data analysis: Columns and rows can be used to perform calculations, create visualizations, and identify trends in data.
- Data modeling: Columns and rows can be used to build predictive models that can be used to make predictions about future events.
- Data visualization: Columns and rows can be used to create data visualizations that can help to communicate data insights to others.
Conclusion
Columns and rows are essential concepts in data analytics. By understanding how they work, you can gain the skills you need to organize, analyze, and visualize data.
Here are some additional tips for using columns and rows in data analytics:
- Use descriptive column names to make your data tables easier to understand.
- Use consistent formatting for your data values to make them easier to read.
- Use meaningful row names to identify the individual observations in your data set.
- Use comments to explain your data cleaning and analysis procedures.
By following these tips, you can create data tables that are clear, concise, and easy to use.
Spreadsheets are a big part of data analytics. The sooner you make friends with spreadsheets, the better. Trust me, they’ll save you a lot of time as a data analyst and make your whole job easier. This spreadsheet is one example of how an organized spreadsheet might look. In this video, we’ll demonstrate some of the basic spreadsheet concepts for all of you who are new to this world. This might be a review for some of you more experienced folks out there, but it never hurts to practice what you know. Plus, you might still learn a trick or two.
I showed you this image earlier. Let’s explore it further because it’s a great example of the three main features of a spreadsheet: cells, rows, and columns. They’ll be part of almost everything you do in a spreadsheet, from making a simple grocery list to analyzing a complex dataset. I use spreadsheets to manage everything from my own personal finances to the annual homecoming party my friends and I have every year. I’m the planner, so I use a spreadsheet to keep things in order, making sure we have everything we need.
Speaking of keeping things in order, columns are organized vertically in a spreadsheet and are ordered by letter. And the rows are organized horizontally and are ordered by number. So when you talk about a specific cell, you name it by combining the column letter and the row number where the cell is located. For example, in this spreadsheet, the word row is in cell D3.
Let’s get started in an actual spreadsheet. You can complete all of the steps in just about any spreadsheet program. Let’s get to know your spreadsheet a little better now. We’ll start with some basic operations. Keep in mind, as an analyst, you won’t always create your own dataset. But for now, let’s do just that. I’ll click in cell A2 and type my first name like this. Then I’ll click in cell B2 and type my last name. Don’t worry if your name doesn’t fit in the cell; you can always make columns wider if you need to. All you have to do is click and drag the right edge of the column until your name fits. Or you can use the text wrapping feature, which will set cells to automatically change their height to allow the text in the cell to fit. To use this feature, select the cells, columns, or rows with text, then use the format menu to look at the text wrapping options. It is automatically set to allow the text to overflow out of the cell. But you can wrap the text instead so all of the text is visible. The clip option will cut off the text in the cell so only the text that fits is visible.
There it is. We’ve added data. Now let’s label it. This is important for organization. Adding labels to the top of the columns will make it easier to reference and find data later on when you’re doing analysis. These column labels are usually called attributes. An attribute is a characteristic or quality of data used to label a column in a table. More commonly, attributes are referred to as column names, column labels, headers, or the header row. Let’s add some headers to our table. I’ll click in cell A1 and type the words first name. In cell B1 I’ll type last name. We’ll make these attributes bold so they stand out more. Spreadsheets can get really big, so you want to make sure your data is clearly labeled and easy to find. I can use my cursor to select the cells with the attributes. Then I’ll click the bold icon to make them bold. Looking good so far.
Ready to add some more data? Let’s start with some new attributes. First, I’ll add a column for the number of siblings by typing siblings in cell C1. Then I’ll add two more attributes in the next two columns. Let’s go with favorite color and favorite dessert. I’ll make them bold too. To fit the labels in the cells, I’ll adjust the size of the columns just like before. Now, keep in mind, there are more ways to adjust the size of the columns and rows. If you have questions about using spreadsheets, a quick search online will usually help you find what you need. We’ve also included a reading with more tips and information about spreadsheets.
OK, let’s get back to it. Now, I can add my own data to the dataset. I’ll type in how many siblings I have and my favorite color and dessert in the appropriate cells. Next, I’ll add data for two more people. We now have three rows of data. In a dataset, a row is also called an observation. An observation includes all of the attributes for something contained in a row of a data table. In this case, row 3 is an observation of Willa Stein because we see all of her attributes in this row.
So now we know spreadsheets let you do lots of things with data. You can store and organize data like we’ve done in this spreadsheet. But you can go even further and reorganize existing data too. Here, I’ll show you how. Let’s say we want to organize our data by how many siblings each person has. There’s a simple way to do that. First, we’ll need to select all of our columns with data so that all of it is reorganized together. Then we can go to our data menu. Here we have some options. Let’s select sort range. This will let us choose how to organize the column. Next, we’ll choose A to Z, which will organize our numbers in order from smallest to largest. Now, we want to watch out for the header row, which is the word siblings, the attribute for this column. We’ll check this box. This makes sure the word siblings stays in place. Now we’re ready to sort. Voila, we just reorganized our data by sorting it from the smallest number to the largest.
As we go further, you’ll discover lots of other ways to work with data in a spreadsheet, including functions and formulas. Let’s finish with a quick example of a formula. You can think of formulas as one way of manipulating data in a spreadsheet. Formulas are like a calculator, but more powerful. A formula is a set of instructions that performs a specific action using the data in a spreadsheet. To do this, the formula uses cell references for the values it’s calculating. Let me show you. Here we go. We’ll click in the next cell in the siblings column. Then we’ll type an equal sign. All formulas begin with this symbol. Next, we’ll type in the cells we want to add together. In this case, we’ll type in C2 plus C3 plus C4. Now we can press “Enter”. There it is. The formula has given us the total number of siblings represented in this dataset. We’ve just analyzed some data.
We’ll want to store the data for later use. In Google Sheets, a spreadsheet is automatically saved in your Google Drive. For Excel and other spreadsheets, you’ll save them as a file. Now you know some basics for using spreadsheets. Once you’re used to these concepts, you’ll be able to learn even more about spreadsheet tools. Feel free to re-watch this video and practice on your own. You can even make your own version of the spreadsheet with your own data. Bye for now.
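The sort and the formula from the video can be mirrored in a short Python sketch. The sibling counts, colors, and desserts below are invented for illustration (the video names only Willa Stein in row 3), and keeping the header in its own list plays the role of the header row checkbox.

```python
# Header row (attributes), kept separate so sorting never moves it.
header = ["First name", "Last name", "Siblings", "Favorite color", "Favorite dessert"]

# Made-up observations standing in for spreadsheet rows 2, 3, and 4.
data = [
    ["Sam", "Lee", 3, "green", "pie"],
    ["Willa", "Stein", 1, "blue", "cake"],
    ["Omar", "Diaz", 2, "red", "flan"],
]

# Sort whole rows by the Siblings column, smallest to largest,
# so every value stays with its own observation.
siblings_col = header.index("Siblings")
sorted_data = sorted(data, key=lambda row: row[siblings_col])

# Equivalent of the spreadsheet formula =C2+C3+C4.
total_siblings = sum(row[siblings_col] for row in sorted_data)

for row in [header] + sorted_data:
    print(row)
print("Total siblings:", total_siblings)  # Total siblings: 6
```

Sorting complete rows rather than a single column is the point of selecting all of the columns before choosing sort range: it keeps each person’s attributes together.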
In a table, an attribute is a characteristic or quality of data used for what purpose?
To label a column
In a table, an attribute is a characteristic or quality of data used to label a column.
Practice Quiz: Hands-On Activity: Generating a chart from a spreadsheet
Reading: More spreadsheet resources
In the spirit of lifelong learning, it is good to have resources to turn to when you want to know more about using spreadsheets. Two of the most well known and used spreadsheet platforms are Google Sheets and Microsoft Excel. Both provide free online training resources that you can access anytime you need them. Bookmark these links if you want to access them later.
Google Sheets Training and Help
Learn even more ways to move, store, and analyze your data with the Google Sheets Training and Help page, located in the Google Workspace Learning Center. This hub offers an expanded list of tips, from beginner to advanced, along with cheat sheets, templates, guides, and tutorials.
Want to learn more about Google Sheets? This online help article features a short list of the most important functions you will use, including rows, columns, cells, and functions.
Microsoft Excel for Windows Training
Get to know Excel spreadsheets a little better by visiting this free online training center. Offering everything from a quick-start guide and introduction to tutorials and templates, you will find everything you need to know, all in one place.
Practice Quiz: Test your knowledge on spreadsheet basics
In a spreadsheet, what is text wrapping used for?
To allow all of the text to fit inside a cell
In a spreadsheet, text wrapping is used to allow all of the text to fit inside a cell.
The columns in a spreadsheet are ordered by letter, and the rows are ordered by number.
True
In a spreadsheet, columns are ordered by letter and rows are ordered by number.
Fill in the blank: In a data table, a row is called an observation. An observation includes all of the _____ for what is contained in the row.
attributes
In a data table, a row is called an observation. An observation includes all of the attributes for what is contained in the row. An attribute is a quality or characteristic of data.
Structured Query Language (SQL)
Video: SQL in action
The video introduces SQL and its capabilities, comparing it to spreadsheets. SQL is a query language that can be used to store, organize, and analyze data in databases. It is more powerful than spreadsheets and can handle larger datasets.
The video also shows how to use a basic SQL query to select all of the data from a table and to filter the data based on certain conditions.
Here is a summary of the key points:
- SQL is a query language that can be used to store, organize, and analyze data in databases.
- SQL is more powerful than spreadsheets and can handle larger datasets.
- A basic SQL query has the following structure:
SELECT [columns]
FROM [table]
WHERE [condition]
- The SELECT clause specifies the columns that you want to retrieve from the table.
- The FROM clause specifies the table that you want to query.
- The WHERE clause specifies the conditions used to filter the data.
The video concludes by encouraging viewers to continue learning about SQL and to use it to analyze data themselves.
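One way to try the SELECT, FROM, WHERE structure without installing a database server is Python’s built-in sqlite3 module. The sketch below is an assumption-laden illustration: the `movies` table, its columns, and its three rows are invented here and are not the dataset shown in the video.

```python
import sqlite3

# An in-memory SQLite database with a hypothetical movies table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, genre TEXT, release_year INTEGER)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [
        ("Sky Chase", "Action", 2019),
        ("Quiet Garden", "Drama", 2020),
        ("Night Convoy", "Action", 2021),
    ],
)

# SELECT * with FROM returns every column of every row in the table.
all_rows = conn.execute("SELECT * FROM movies").fetchall()

# Adding WHERE keeps only the rows that meet the condition.
action_titles = conn.execute(
    "SELECT title FROM movies WHERE genre = 'Action'"
).fetchall()

print(len(all_rows))  # 3
print(action_titles)
conn.close()
```

Each result row comes back as a tuple, so a single selected column still appears as a one-element tuple such as `('Sky Chase',)`.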
What is SQL?
SQL stands for Structured Query Language. It is a language used to manage data in relational databases. Relational databases are databases that store data in tables. Each table has a set of columns that define the data that can be stored in the table.
What can SQL do?
SQL can be used to do many things with data in relational databases, including:
- Create and delete tables
- Insert, update, and delete data from tables
- Select data from tables
- Join tables together
- Sort and filter data
- Aggregate data
- Create views
- Grant and revoke permissions
Why is SQL important for data analytics?
SQL is an essential tool for data analysts. Data analysts use SQL to access and manipulate data in relational databases. This allows them to extract insights from data that would not be possible to obtain otherwise.
How to learn SQL
There are many resources available to help you learn SQL. Here are a few suggestions:
- Online courses: There are many online courses available that teach SQL. Some of these courses are free, while others require a fee.
- Books: There are many books available that teach SQL. Some of these books are aimed at beginners, while others are more advanced.
- Tutorials: There are many tutorials available online that teach SQL. These tutorials can be a great way to learn the basics of SQL.
- Practice: The best way to learn SQL is to practice. Try to find a dataset that you are interested in and use SQL to query the data.
SQL in action
Here is an example of how SQL can be used in data analytics: Let’s say you are a data analyst for a retail company. You want to find out which products are the most popular among customers. You can use SQL to query the company’s sales database to find the products with the highest sales.
The following SQL query would return the products with the highest sales:
SELECT product_name, SUM(quantity_sold) AS total_sales
FROM sales_table
GROUP BY product_name
ORDER BY total_sales DESC
This query first selects the product name and the total sales for each product. It then groups the results by product name and orders the results by total sales in descending order. This will return the products with the highest sales at the top of the results.
This is just one example of how SQL can be used in data analytics. There are many other things that you can do with SQL, such as joining tables together, filtering data, and aggregating data.
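To see the grouping query run end to end, here is a sketch using Python’s built-in sqlite3 module. The table and column names match the query above, but the five sales rows are invented for illustration.

```python
import sqlite3

# In-memory database with a made-up sales_table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_table (product_name TEXT, quantity_sold INTEGER)")
conn.executemany(
    "INSERT INTO sales_table VALUES (?, ?)",
    [("mug", 4), ("lamp", 2), ("mug", 5), ("desk", 1), ("lamp", 10)],
)

# Total the quantity per product, then list the best sellers first.
query = """
SELECT product_name, SUM(quantity_sold) AS total_sales
FROM sales_table
GROUP BY product_name
ORDER BY total_sales DESC
"""
top_products = conn.execute(query).fetchall()
print(top_products)  # [('lamp', 12), ('mug', 9), ('desk', 1)]
conn.close()
```

The five input rows collapse into three result rows, one per product, because GROUP BY combines rows that share the same product_name before SUM is applied.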
Fill in the blank: A data analyst uses a SQL query to retrieve information from a database. They add a WHERE statement to _____ the data based on certain conditions.
filter
They add a WHERE statement to filter the data based on certain conditions.
As you might remember, earlier we touched on the query language SQL. In this video, you’ll see SQL in action and learn what you can do with it, with some examples of specific queries. I guess you can call this the SQL sequel. We’ll try to make this one even better than the original.
Remember, SQL can do lots of the same things with data that spreadsheets can do. You can use it to store, organize and analyze your data, among other things. But like any good sequel, it’s on a larger scale, bigger, more action-packed. Think of it as supersize spreadsheets. For example, you might want to consider a spreadsheet when you have a smaller dataset, such as one with just 100 rows. But if your dataset seems to go on forever, and your spreadsheet is struggling to keep up, SQL would be the way to go.
When you use SQL, you need a place where the SQL language is understood. If you’ve ever gone somewhere and not known the language, it can be challenging to communicate. You might think you’re asking for one thing and get something completely different. Well, SQL knows the feeling. SQL needs a database that will understand its language. Let’s talk. There are a number of databases out there that use SQL. You may use several of them during your time as a data analyst. But here’s the thing, no matter which database you use, SQL basically works the same in each. For example, in SQL, queries are universal.
We’ve talked about queries before, but it never hurts to have a refresher. A query is a request for data or information from a database. Here’s the structure of a basic query. You can see that with this query we can select specific data from a table. By adding WHERE, we can filter the data based on certain conditions.
Let’s get started. We’ll open our database and see how SQL can communicate with it to do some simple data tasks. First, let’s select our dataset. We’ll use an asterisk to select all of the data from the table. With that simple query, the database calls up the table we need. Magic. Let’s add WHERE to our query to show how that changes what data we get. You can see the data now only shows movies that are in the action genre. That’s it, a basic query in SQL. Pretty cool, huh?
Soon you’ll learn about building more complex queries. For now, though, we can celebrate learning about the structure of a basic SQL query: SELECT, FROM, and WHERE. As you continue the program, you’ll have the opportunity to use SQL yourself. I hope this video was a useful sneak peek at what’s coming later.
Reading: SQL Guide: Getting started
Video: Angie: Everyday struggles when learning new skills
Angie is a Program Manager of Engineering at Google who is currently working on the Data Analytics certificate. She shares her experience learning SQL and how she was inspired by her parents’ experience learning English as immigrants.
Angie remembers feeling frustrated when she first started learning SQL because everyone around her seemed to be fluent. She struggled with the most basic things, like getting data out of a table or finding the average of something. She compares learning SQL to learning a new language, and says that it felt like she was at toddler level while everyone else was fluent.
Angie then shares the story of her parents’ experience learning English. She remembers watching them struggle every day to pick up a new language and do basic things like ask for help at the grocery store. She remembers calling the cable company when she was six to ask questions about the bill because her parents couldn’t. She remembers how hard her parents worked to learn English and become fluent.
Angie says that this experience inspired her to keep trying when she was learning SQL. She thought that if her parents could do it, she could too. She also realized that it’s okay to ask for help, even for the most basic things.
Angie’s story is a reminder that everyone struggles when they are learning something new. It’s important to be patient with yourself and to ask for help when you need it.
I’m Angie, I’m a Program Manager of Engineering at Google. I’m currently working on the Data Analytics certificate. Previously, I was a researcher in people analytics. I was also what I call an analytical mercenary, working for a lot of different companies to help them make sense of their data.
Every time I learn a new skill, I feel like I’m learning how to speak all over again. I remember the first time I learned SQL, I was so frustrated because everyone around me, it felt like they were fluent, they knew exactly what they were doing. I remember struggling with the most basic things, just like getting the data out of the table, or I remember somebody asked me just to find an average of something and I kept on getting an error. It really does feel like you’re learning a new language and you’re at toddler level and everyone around you is, like, maybe fluent.
My parents immigrated to this country when they were in their 30s. After they had learned another language, they had to start over and learn English. I remember as a child watching them struggle every day to pick up a new language, to do really basic things, like ask for help at the grocery store. I remember calling the cable company when I was six, asking them questions about the bill because my parents couldn’t. I remember how hard they worked to learn this new language and to become fluent, and every time I’m learning a new data language like SQL or R, I think about how hard that must have been. I think if they can do that, I can learn SQL. If they can ask for help for the most basic of things, I can ask the data analysts next to me how to write a SQL statement and how to get data out of a table. That really helped me, just having that mindset and knowing that I can ask for help.
Reading: Endless SQL possibilities
Practice Quiz: Test your knowledge on SQL
What does the asterisk (*) after SELECT tell the database to do in this query?
SELECT *
FROM employee
WHERE jobCode = 'FTE'
AND LastName = 'James'
Select all columns from the employee table
SELECT * tells the database to select all columns from the employee table. The criteria in the WHERE clause tells the database what data in those columns the query should return.
In this query, the data analyst wants to retrieve data from which table?
SELECT *
FROM employee
WHERE jobCode = 'FTE'
AND LastName = 'James'
employee
The data analyst wants to retrieve data from the employee table.
In this query, what will be retrieved from the database?
SELECT *
FROM employee
WHERE jobCode = 'FTE'
AND LastName = 'James'
All data from the employee table, where the jobCode is FTE and the last name is James.
This query will select all data from the employee table, where the jobCode is FTE and the last name is James.
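The effect of joining two conditions with AND can be checked with a small sketch using Python’s built-in sqlite3 module. The employee rows and the FirstName column are invented for illustration; only the table name and the two filtered columns come from the quiz.

```python
import sqlite3

# In-memory database with a hypothetical employee table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (FirstName TEXT, LastName TEXT, jobCode TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        ("Ada", "James", "FTE"),
        ("Bo", "James", "PTE"),   # right last name, wrong job code
        ("Cy", "Ortiz", "FTE"),   # right job code, wrong last name
    ],
)

# A row is returned only when BOTH conditions are true.
matches = conn.execute(
    "SELECT * FROM employee WHERE jobCode = 'FTE' AND LastName = 'James'"
).fetchall()
print(matches)  # [('Ada', 'James', 'FTE')]
conn.close()
```

Two of the three rows satisfy one condition each, but only one row satisfies both, so only that row comes back.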
You are working with a database table that contains data about music artists. The table is named artist. You want to review all the columns in the table.
You write the SQL query below. Add a FROM clause that will retrieve the data from the artist table.
How many columns are in the artist table?
SELECT *
FROM artist
2
The clause FROM artist will retrieve the data from the artist table. The complete query is SELECT * FROM artist. The FROM clause specifies which database table to select data from. There are two columns in the artist table.
You are working with a database table that contains data about music albums. You are only interested in data related to the album with ID number 277. The album IDs are listed in the album_id column from the album table.
You write the SQL query below. Add a WHERE clause that will return only data about the album with ID number 277.
What is the name of the album with ID number 277?
SELECT *
FROM album
WHERE album_id = 277
Bach: Goldberg Variations
The clause WHERE album_id = 277 will return only data about the album with ID number 277. The complete query is SELECT * FROM album WHERE album_id = 277. The WHERE clause filters results that meet certain conditions. The WHERE clause includes the name of the column, an equals sign, and the value(s) in the column to include. The name of the album with ID number 277 is Bach: Goldberg Variations.
Data visualization
Video: Becoming a data viz whiz
The video discusses the importance of data visualization in data analysis. It is a way to make data easier to understand and more visually appealing. Data visualization can be used to communicate findings to stakeholders quickly and effectively.
The video also gives the example of Florence Nightingale, who used data visualization to convince hospital administrators to focus on preventable conditions during the Crimean War. This shows how data visualization can be used to make a real difference in the world.
The video then shows how to create a simple bar graph in a spreadsheet. This is just a basic example, but it shows how easy it is to create visualizations using spreadsheet software.
The video concludes by encouraging viewers to learn more about data visualization and to use it to become more effective data analysts.
Here are the key points from the video:
- Data visualization is the graphical representation of information.
- Data visualization is an important tool for data analysts because it can make data easier to understand and more visually appealing.
- Data visualization can be used to communicate findings to stakeholders quickly and effectively.
- Florence Nightingale used data visualization to convince hospital administrators to focus on preventable conditions during the Crimean War.
- It is easy to create visualizations using spreadsheet software.
- Data analysts should learn more about data visualization and use it to become more effective.
1. Understand the basics of data visualization
The first step to becoming a data viz whiz is to understand the basics of data visualization. This includes understanding the different types of data visualizations, the principles of good design, and the different tools that can be used to create data visualizations.
2. Practice creating data visualizations
The best way to learn data visualization is by practicing. There are many online resources that offer tutorials and exercises on data visualization. You can also find data visualization competitions and challenges that you can participate in.
3. Get feedback on your work
Once you have created some data visualizations, it is important to get feedback on your work. This feedback can help you identify areas where your visualizations can be improved. You can get feedback from friends, colleagues, or mentors.
4. Stay up-to-date on the latest trends
The field of data visualization is constantly evolving. It is important to stay up-to-date on the latest trends so that you can create the most effective visualizations. You can do this by reading blogs, attending conferences, and taking online courses.
5. Build a portfolio
A portfolio is a great way to showcase your data visualization skills. It can help you get a job in data analytics or freelancing gigs. When creating your portfolio, be sure to include a variety of visualizations that demonstrate your skills.
6. Never stop learning
There are always new techniques and tools to learn, so treat learning as an ongoing habit rather than a one-time step. Revisiting the basics, practicing regularly, and following new developments will help you become a data viz whiz and create the most effective visualizations.
Here are some additional tips for becoming a data viz whiz:
- Choose the right tool for the job. There are many different data visualization tools available, so it is important to choose the right one for the task at hand. Consider the type of data you are working with, the audience you are creating the visualization for, and your own skill level.
- Keep your audience in mind. When creating a data visualization, always keep your audience in mind. What are they trying to learn from the visualization? How can you make the visualization as clear and concise as possible?
- Use color effectively. Color can be a powerful tool in data visualization. However, it is important to use color effectively. Use colors that are easy to distinguish and that make sense for the data you are visualizing.
- Tell a story. A good data visualization should tell a story. The visualization should be more than just a collection of numbers and charts. It should help the viewer understand the data and draw conclusions.
With hard work and dedication, you can become a data viz whiz and create powerful visualizations that communicate data effectively.
What are some reasons why a data analyst might use data visualizations? Select all that apply.
To create interesting graphs
To explain complex data quickly
To reinforce data analysis
Data analysts use data visualizations to explain complex data quickly, reinforce data analysis, and create interesting graphs and charts.
Your data analysis toolbox is getting full. Learning about both spreadsheets and SQL will get you far in the world of data analysis. There’s more to learn, of course, and lots more tools you’ll be able to use, but your future is looking bright. It’s about to look even brighter, because we’re here to talk more about data visualization. I’ll tell you a little more about the role of data visualization tools in data analytics, and give you a chance to see those tools in action later in this video.
You might remember that data visualization is the graphical representation of information. For tons of data analysts, it’s the most exciting part of their job because they get to see their hard work pay off with something interesting. Not to mention that data visualization is beautiful and useful. I was floored when I got to Google and started to get a quarterly data report in my email and had a big slide deck where people contributed their visualizations. It was definitely a source of light as I started to build my own visualizations.
If you’re not impressed by my story, let me tell you about Florence Nightingale. Does that name ring a bell? She is responsible for much of the philosophy of modern nursing, and believe it or not, she was also a data analyst. During the Crimean War in the 1850s, thousands of soldiers were dying every day. Nightingale wanted to find a way to reduce the number of deaths. After examining the data, she found that the majority of soldiers were dying from preventable conditions. To convince hospital administrators that they needed to focus on these conditions, she created a chart showing the number of deaths over several months. The much larger blue sections in the visualization represent the preventable deaths. Her work directly led to major changes in patient care. She did all of this over 150 years ago without a computer.
One of the main reasons Nightingale created this visualization was to make the data easier to digest for her audience. She felt she’d be more successful convincing the stakeholders using visuals instead of just words and numbers. She was right. Tables filled with data, while necessary for analysis, just aren’t able to show trends and patterns as quickly and clearly as visualizations can.
Imagine you receive an assignment that needs to be completed the same day. You gather the data you need in a table. Could you explain your findings using the table? Yes, you probably could, but a better idea would be to use a visualization like this bar graph. Something like this makes it much easier for you to explain quickly, and you’ve got the benefit of a cool graphic to back up your analysis. As a data analyst, you’ll want to create visualizations that make the data easy to understand and interesting to look at, so show it off. Stakeholders may not have much time to devote to the data, so your job will be to make their time worthwhile.
Let’s go back to that data table we created earlier in the course. If you created your own for practice, you can open it up now or try this out later. Here’s the data we added before. Let’s create a visualization of the data by inserting a chart, a bar graph. You can see that the spreadsheet visualized the data from our table in a way that made the most sense. It created a bar graph or column chart to compare the ages of each person by name, but you might have figured that out already. That’s the beauty of visualization: it shows data analysis quickly and clearly.
We can use the Chart editor to adjust the chart. Different spreadsheet programs might have different ways to do this, but they all have visualization functions and ways to edit those visualizations. For now, let’s just look at the suggested charts. We can make the bars go horizontally using a bar chart. That looks great, so let’s close the Chart editor. There are lots of options to look at, but we’ll keep it basic for now. Feel free to try other visualizations if you practice later. Now, we can adjust our chart to make our whole spreadsheet look clean and professional. Excellent.
I hope you learn to love data visualization as much as I do. Maybe you’ll become a data visualization pioneer, just like Florence Nightingale. As a budding data analyst, you’ve started to fill your utility belt with valuable tools that you’ll use throughout the rest of the program. Having spreadsheets, SQL, and data visualization know-how will help make you an ace data detective. You’ll be able to use these tools throughout the data analytics process as you move forward.
Coming up next, you’ll complete a few activities to wrap up this part of the program. You’ll also complete an assessment to check your understanding of all that you’ve learned. This is a great opportunity to think about some of the areas that you’ll continue to explore in this course and in your career. As always, feel free to review the videos and readings to help remind you of certain topics and ideas, even if you already feel prepared. You’re just a few steps away from the next course. That’s great progress. Keep it up.
Reading: Planning a data visualization
Video: Lilah: The power of a visualization
Lilah Jones, a member of the cloud team at Google, discusses the importance of data visualization. She compares data visualization to telling a story with pictures. She says that data visualization can be used to make data more understandable and to support decision-making.
Jones gives the example of budgeting software that she used. The software provided interactive visualizations that changed depending on her input. This helped her better understand her spending habits and make better budgeting decisions.
Jones concludes by saying that data visualization is like having the answer sheet for a test. It helps us to make good decisions because it is backed up by data.
My name is Lilah Jones, and I am a part of our cloud team. I get a chance to lead a team of amazing individuals that are focused on helping customers get to the cloud.
Data visualizations, that’s a long word, and that can also make your eyes glaze over. But I wonder if, when you were little and you were with your parents, maybe they had a bedtime routine, or maybe you have children and you’re doing a bedtime routine with them. You very rarely are going to come to those children with a bunch of facts and figures before they go to bed. But I bet you probably are telling them a story, you’re showing them pictures. I know I always loved comic books. Pictures tell a story.
Data visualizations are pictures. They are a wonderful way to take very basic ideas around data and data points and make them come alive. You can do all different types of combinations of visualizations, but the ones that are interactive, wow, those are huge. Can you imagine being an executive of an organization and trying to figure out, should we open up another site in Bangkok? Does that make sense? Us being able to walk in and say, here’s why it makes sense, and having great data visualizations to support all of our points of view, makes it a no-brainer.
Interestingly enough, I do recall the first time I came across a super amazing visualization. It was in my personal life. I switched my budgeting software from one provider to another, and the provider that I switched to was really focused on every dollar having a job and making sure you’re budgeting every single dollar. They gave visualizations that changed depending on what input you would add, and it really just changed my entire perspective, the entire thing.
So having the data is like having the answer sheet for a test. It really just lets you know that you’re going to make good decisions because it’s backed up by data.
Practice Quiz: Test your knowledge on visualizing data
Fill in the blank: A data visualization is the _____ representation of information.
graphical
A data visualization is the graphical representation of information.
When would a pie chart be an effective visualization?
When showing a class broken down by age
A pie chart shows how a whole is broken down into parts and is an effective visualization for a class broken down by age.
What are the key benefits of data visualizations? Select all that apply.
- They can illustrate relationships between data points
- They can help stakeholders understand complex data more quickly
- They can clearly demonstrate patterns and trends
Data visualizations can clearly demonstrate patterns and trends, help stakeholders understand complex data more quickly, and illustrate relationships between data points.
*Weekly challenge 4*
Reading: Glossary: Terms and definitions
Data Analytics
A
Analytical skills: Qualities and characteristics associated with using facts to solve problems
Analytical thinking: The process of identifying and defining a problem, then solving it by using data in an organized, step-by-step manner
Attribute: A characteristic or quality of data used to label a column in a table
B
C
Context: The condition in which something exists or happens
D
Data: A collection of facts
Data analysis: The collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making
Data analyst: Someone who collects, transforms, and organizes data in order to draw conclusions, make predictions, and drive informed decision-making
Data analytics: The science of data
Data design: How information is organized
Data-driven decision-making: Using facts to guide business strategy
Data ecosystem: The various elements that interact with one another in order to produce, manage, store, organize, analyze, and share data
Data science: A field of study that uses raw data to create new ways of modeling and understanding the unknown
Data strategy: The management of the people, processes, and tools used in data analysis
Data visualization: The graphical representation of data
Database: A collection of data stored in a computer system
Dataset: A collection of data that can be manipulated or analyzed as one unit
E
F
Formula: A set of instructions used to perform a calculation using the data in a spreadsheet
Function: A preset command that automatically performs a specified process or task using the data in a spreadsheet
G
Gap analysis: A method for examining and evaluating the current state of a process in order to identify opportunities for improvement in the future
H
I
J
K
L
M
N
O
Observation: The attributes that describe a piece of data contained in a row of a table
P
Q
Query: A request for data or information from a database
Query language: A computer programming language used to communicate with a database
R
Root cause: The reason why a problem occurs
S
Stakeholders: People who invest time and resources into a project and are interested in its outcome
T
Technical mindset: The ability to break things down into smaller steps or pieces and work with them in an orderly and logical way
U
V
Visualization: (Refer to data visualization)
W
X
Y
Z
Quiz: *Weekly challenge 4*
In the following spreadsheet, the column labels in row 1 are called what?
  | A | B | C | D |
1 | Rank | Name | Population | County |
2 | 1 | Charlotte | 885,708 | Mecklenburg |
3 | 2 | Raleigh | 474,069 | Wake (seat), Durham |
4 | 3 | Greensboro | 296,710 | Guilford |
5 | 4 | Durham | 278,993 | Durham (seat), Wake, Orange |
6 | 5 | Winston-Salem | 247,945 | Forsyth |
7 | 6 | Fayetteville | 211,657 | Cumberland |
8 | 7 | Cary | 170,282 | Wake, Chatham |
9 | 8 | Wilmington | 123,784 | New Hanover |
10 | 9 | High Point | 112,791 | Guilford, Randolph, Davidson, Forsyth |
11 | 10 | Concord | 96,341 | Cabarrus |
Attributes
The column labels in row 1 are attributes that refer to the data in the column. An attribute is a characteristic or quality of data used to label a column in a table.
In the following spreadsheet, the observation of Greensboro describes all of the data in row 4.
  | A | B | C | D |
1 | Rank | Name | Population | County |
2 | 1 | Charlotte | 885,708 | Mecklenburg |
3 | 2 | Raleigh | 474,069 | Wake (seat), Durham |
4 | 3 | Greensboro | 296,710 | Guilford |
5 | 4 | Durham | 278,993 | Durham (seat), Wake, Orange |
6 | 5 | Winston-Salem | 247,945 | Forsyth |
7 | 6 | Fayetteville | 211,657 | Cumberland |
8 | 7 | Cary | 170,282 | Wake, Chatham |
9 | 8 | Wilmington | 123,784 | New Hanover |
10 | 9 | High Point | 112,791 | Guilford, Randolph, Davidson, Forsyth |
11 | 10 | Concord | 96,341 | Cabarrus |
True
The observation of Greensboro describes all of the data in row 4. An observation is all of the attributes for something contained in a row of a data table.
Fill in the blank: In the following spreadsheet, the _____ feature was used to alphabetize the city names in column B.
  | A | B | C | D |
1 | Rank | Name | Population | County |
2 | 7 | Cary | 170,282 | Wake, Chatham |
3 | 1 | Charlotte | 885,708 | Mecklenburg |
4 | 10 | Concord | 96,341 | Cabarrus |
5 | 4 | Durham | 278,993 | Durham (seat), Wake, Orange |
6 | 6 | Fayetteville | 211,657 | Cumberland |
7 | 3 | Greensboro | 296,710 | Guilford |
8 | 9 | High Point | 112,791 | Guilford, Randolph, Davidson, Forsyth |
9 | 2 | Raleigh | 474,069 | Wake (seat), Durham |
10 | 8 | Wilmington | 123,784 | New Hanover |
11 | 5 | Winston-Salem | 247,945 | Forsyth |
Sort range
Sort range was used to alphabetize the city names in column B. Sorting a range of data from A to Z helps data analysts organize and find data more quickly.
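The effect of sorting a range from A to Z can be sketched in a few lines of Python. This is only an illustration of the idea, not how a spreadsheet implements it; the rows are the rank, name, and population values from the quiz table in its original rank order.

```python
# Rows from the quiz spreadsheet in rank order: (rank, name, population).
cities = [
    (1, "Charlotte", 885708),
    (2, "Raleigh", 474069),
    (3, "Greensboro", 296710),
    (4, "Durham", 278993),
    (5, "Winston-Salem", 247945),
    (6, "Fayetteville", 211657),
    (7, "Cary", 170282),
    (8, "Wilmington", 123784),
    (9, "High Point", 112791),
    (10, "Concord", 96341),
]

# Sort whole rows by the name column (index 1), A to Z. Each row moves as a
# unit, so every city keeps its own rank and population.
sorted_cities = sorted(cities, key=lambda row: row[1])

for rank, name, population in sorted_cities:
    print(rank, name, population)
```

Sorting whole rows rather than a single column is what keeps each rank and population attached to the right city, which is why the Rank column in the sorted spreadsheet is no longer in numeric order.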
A data analyst types =POPULATION(C2:C11) to find the average population of the cities in this spreadsheet. However, they realize they used the wrong formula. What syntax will correct this function?
  | A | B | C | D |
1 | Rank | Name | Population | County |
2 | 1 | Charlotte | 885,708 | Mecklenburg |
3 | 2 | Raleigh | 474,069 | Wake (seat), Durham |
4 | 3 | Greensboro | 296,710 | Guilford |
5 | 4 | Durham | 278,993 | Durham (seat), Wake, Orange |
6 | 5 | Winston-Salem | 247,945 | Forsyth |
7 | 6 | Fayetteville | 211,657 | Cumberland |
8 | 7 | Cary | 170,282 | Wake, Chatham |
9 | 8 | Wilmington | 123,784 | New Hanover |
10 | 9 | High Point | 112,791 | Guilford, Randolph, Davidson, Forsyth |
11 | 10 | Concord | 96,341 | Cabarrus |
=AVERAGE(C2:C11)
The correct AVERAGE function syntax is =AVERAGE(C2:C11). AVERAGE returns an average of values from a selected range. C2:C11 is the specified range.
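To see what =AVERAGE(C2:C11) works out to for this table, the same calculation can be sketched in Python: add up the ten population values in C2:C11 and divide by how many there are. The variable names are made up for this sketch.

```python
# Population values from cells C2:C11 of the quiz spreadsheet.
populations = [
    885708, 474069, 296710, 278993, 247945,
    211657, 170282, 123784, 112791, 96341,
]

# An average is the sum of the values divided by the number of values.
total = sum(populations)
average_population = total / len(populations)

print(total)               # 2898280
print(average_population)  # 289828.0
```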
You are working with a database table named playlist that contains data about playlists for different types of digital media. You want to review all the columns in the table.
You write the SQL query below. Add a FROM clause that will retrieve the data from the playlist table.
What is the playlist with ID number 3?
SELECT *
FROM playlist
TV Shows
The clause FROM playlist will retrieve the data from the playlist table. The complete query is SELECT * FROM playlist. The FROM clause specifies which database table to select data from. The playlist with ID number 3 is TV Shows.
You are working with a database table that contains invoice data. The customer_id column lists the ID number for each customer. You are interested in invoice data for the customer with ID number 28.
You write the SQL query below. Add a WHERE clause that will return only data about the customer with ID number 28.
What is the billing city for the customer with ID number 28?
SELECT
*
FROM
invoice
WHERE
customer_id = 28
Salt Lake City
The clause WHERE customer_id = 28 will return only data about the customer with ID number 28. The complete query is SELECT * FROM invoice WHERE customer_id = 28. The WHERE clause filters results that meet certain conditions. The WHERE clause includes the name of the column, an equals sign, and the value(s) in the column to include. The billing city for the customer with ID number 28 is Salt Lake City.
A data analyst creates the following visualization to clearly demonstrate how much more populous Charlotte is than the next-largest North Carolina city, Raleigh. It’s called a line chart.
False
This is a column chart. A column chart is effective at demonstrating the differences between several items in a specific range of values.
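The comparison in this question can be roughed out as a text chart in Python, using the two population figures from the quiz spreadsheet. This is only a sketch: the bars run horizontally for simplicity, so strictly speaking it resembles a bar chart rather than a column chart, and the scale of one mark per 50,000 residents is an arbitrary choice.

```python
# Population figures for the two largest cities in the quiz spreadsheet.
populations = {"Charlotte": 885708, "Raleigh": 474069}

# Arbitrary scale chosen for this sketch: one '#' per 50,000 residents.
SCALE = 50_000

# Build one bar per city, with its length proportional to the population.
bars = {city: "#" * round(count / SCALE) for city, count in populations.items()}

for city, bar in bars.items():
    print(f"{city:<10} {bar} {populations[city]:,}")
```

Even in this crude form, the difference in bar length makes the gap between the two cities visible at a glance, which is the point of using a chart for this kind of comparison.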
A data analyst wants to demonstrate how the population in Charlotte has increased over time. They create this data visualization. This is an example of an area chart.
False
This is a line chart. Line charts are effective for illustrating trends and patterns, such as how population changes over time.