Week 4: Organizing and protecting your data

Good organization skills are a big part of most types of work, and data analytics is no different. In this part of the course, you’ll learn the best practices for organizing data and keeping it secure. You’ll also learn how analysts use file naming conventions to help them keep their work organized.

Learning Objectives

  • Explain steps that can be taken to secure data
  • Discuss the use of file-naming conventions by data analysts
  • Describe best practices for organizing data

Effectively organize data


Video: Feel confident in your data

This section of the course will focus on organizing and protecting data. The speaker mentions that keeping data organized is important for several reasons, including:

  • Making it easier to find and use
  • Avoiding mistakes during analysis
  • Protecting the data

The speaker then outlines the topics that will be covered in the next few videos:

  • Organizing data for personal and professional use
  • File naming conventions
  • Security features for spreadsheets

By the end of this section, students will be able to:

  • Organize data in a way that is efficient and effective
  • Use appropriate file naming conventions
  • Implement security features to protect their data
  • Explain these steps to stakeholders

Hey, good to have you back. Up until now, we’ve focused on preparing your data for processing and analysis. In these next videos, we’ll explore another big part of that process, organizing and protecting your data. Keeping your data organized is important for a few reasons; it makes it easier to find and use, helps you avoid making mistakes during your analysis and helps to protect it.

Coming up, we’ll go over the basics of organizing data for personal and professional use and file naming conventions. Then we’ll take a look at some security features for spreadsheets. By the end of these next few videos, you’ll be able to do all these things and you’ll be able to explain these steps to stakeholders, so they can feel confident that your data practices are safe and secure. When you’re ready to get started, go ahead to the next video. There we’ll get started with organizing data for personal use.

Video: Let’s get organized

The speaker discusses several best practices for organizing data, including:

  • Naming conventions: Using logical and descriptive names for files to make them easier to find and use.
  • Foldering: Organizing files into folders to keep project-related files together in one place.
  • Archiving: Moving old projects to a separate location to create an archive and cut down on clutter.
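The archiving practice in the list above can be sketched in a few lines of Python. This is only an illustration, not a method from the video: the function name `archive_project` and the `Archive` folder name are hypothetical, and the demo runs in a temporary directory so no real files are touched.

```python
import shutil
import tempfile
from pathlib import Path


def archive_project(project_dir: Path, archive_root: Path) -> Path:
    """Move a finished project folder into the archive and return its new path."""
    archive_root.mkdir(parents=True, exist_ok=True)
    destination = archive_root / project_dir.name
    if destination.exists():
        # Refuse to overwrite an earlier archive with the same name.
        raise FileExistsError(f"{destination} is already archived")
    shutil.move(str(project_dir), str(destination))
    return destination


# Demo inside a throwaway directory (all names here are made up).
with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    old_project = workspace / "Vacation2025"
    (old_project / "itinerary").mkdir(parents=True)
    (old_project / "itinerary" / "flights.txt").write_text("placeholder")

    archived = archive_project(old_project, workspace / "Archive")
    moved_ok = (archived / "itinerary" / "flights.txt").exists()
    original_gone = not old_project.exists()
    print(archived.name, moved_ok, original_gone)
```

Moving the whole folder, rather than copying it, keeps a single copy of the project and clears it out of the active workspace.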

The speaker also mentions two additional considerations when organizing data for work use:

  • Alignment with team: Aligning naming and storage practices with the team to avoid confusion and develop metadata practices.
  • Data duplication: Avoiding data duplication by storing data in a relational database.
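To make the data duplication point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`customers`, `invoices`) and the sample rows are invented for illustration. The idea is that a customer's email is stored once and referenced by ID, so a correction is made in one place instead of in every copy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
conn.executescript(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        email       TEXT NOT NULL
    );
    CREATE TABLE invoices (
        invoice_id   INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customers(customer_id),
        invoice_date TEXT NOT NULL,
        amount       REAL NOT NULL
    );
    """
)
conn.execute("INSERT INTO customers VALUES (1, 'Acme Co', 'billing@acme.example')")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?, ?)",
    [(101, 1, "2014-06-03", 250.0), (102, 1, "2014-06-17", 480.0)],
)

# The email lives in one row, so one UPDATE corrects it for every invoice.
conn.execute("UPDATE customers SET email = 'accounts@acme.example' WHERE customer_id = 1")

rows = conn.execute(
    """
    SELECT i.invoice_id, c.email
    FROM invoices AS i
    JOIN customers AS c ON c.customer_id = i.customer_id
    ORDER BY i.invoice_id
    """
).fetchall()
print(rows)
conn.close()
```

If the email had instead been copied onto every invoice row in separate spreadsheets, each copy would need the same correction, and any missed copy would contradict the others.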

The speaker then provides examples of data organization, including a Finances folder organized categorically and an Invoices folder organized chronologically. The speaker also discusses other ways to organize data, such as in order of importance or by location.
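The combination of categorical and chronological organization described above could be automated along these lines. This is a sketch under assumptions: the file list is made up, and `target_folder` is a hypothetical helper that places each file under a category folder first and a year folder second.

```python
import tempfile
from pathlib import Path

# Hypothetical files: (category, ISO date, file name).
FILES = [
    ("invoices", "2014-06-03", "invoice_acme.csv"),
    ("invoices", "2015-01-12", "invoice_globex.csv"),
    ("payroll", "2015-01-31", "payroll_january.csv"),
]


def target_folder(root: Path, category: str, iso_date: str) -> Path:
    """Categorical level first, then a chronological (year) level."""
    year = iso_date[:4]
    return root / "Finances" / category / year


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for category, iso_date, name in FILES:
        folder = target_folder(root, category, iso_date)
        folder.mkdir(parents=True, exist_ok=True)
        (folder / name).touch()

    # Collect relative paths, sorted, to show the resulting tree.
    tree = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.csv"))
    print("\n".join(tree))
```

Other schemes, such as organizing by location, would only change what `target_folder` returns.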

Finally, the speaker emphasizes the importance of organizing data early on in a project and compares unorganized data to a messy room. The speaker concludes by stating that file naming conventions carry over into databases and will be discussed in the next video.

Hey, welcome back. Whether you’re organizing your personal data for your own use or organizing project data for work, there are certain procedures you want to follow to make sure your data is easy to find and use. In this video, we’ll cover some best organization practices and also check out some different ways project data can be organized.

There are plenty of best practices you can use when organizing data, including naming conventions, foldering, and archiving older files. We’ve talked about file naming before, which is also known as naming conventions. These are consistent guidelines that describe the content, date, or version of a file in its name. Basically, this means you want to use logical and descriptive names for your files to make them easier to find and use.

Speaking of easily finding things, organizing your files into folders helps keep project-related files together in one place. This is called foldering. For example, all the files related to your vacation plan might go in the Vacation2025 folder. You might then break that folder down even further by creating subfolders like itinerary or photos, depending on what else you’d like to easily access. It can also be useful to move old projects to a separate location to create an archive and cut down on clutter. It’s so much easier to find and use my files when I name them something meaningful and searchable and when I organize them into folders. It makes all my data more accessible and useful.

In addition to these three best practices, there are two more things you’ll want to consider when organizing data for work use. First, the project data you’ll be using for work could be accessed and used by multiple people. It’s important to align your naming and storage practices with your team to avoid any confusion. Your team might also develop metadata practices like creating a file that outlines project naming conventions for easy reference. We’ll get to talk more about naming conventions for work files in more detail later.

Secondly, you want to think about how often you’re making copies of data and storing it in different places. This is important mainly because, if data is stored in lots of different databases or spreadsheets, it can contradict itself and lead to mistakes later on. Also, storing data in multiple places takes up a lot of space. Relational databases can help you avoid data duplication and store your data more efficiently.

You can use these practices to organize data in different ways according to your project. Let’s look at some examples of data organization. I have some sample project folders here, each organized in a slightly different way. Let’s open them up and see what they look like. We’ll start with the high-level Finances folder. The Finances folder has been organized categorically. There are subfolders like budget, invoices, and payroll that represent different categories. Let’s click on “Invoices” to see what’s in there. In the invoices folder, you can see that we have another set of subfolders labeled by year, 2014, 2015…. Looks like these are in chronological order. Sometimes the way files are organized can tell us how the data within those files is also organized. Let’s open a file to see if that’s right. In the 2014 subfolder, there’s a file with invoices from June. If we open it, we can see that they’ve been organized by date, just like the folders.

There’s different ways to organize data depending on what you need it for. The categorical organization of the subfolders in Finances made it easy for me to go straight to the invoices, but the chronological organization of the invoices subfolder can help us find financial data from the exact date we’re looking for. There’s other ways to organize data too: in order of importance or even by location. For example, a company might use hierarchical organization so that employee data mirrors the structure of their employee organization. Or a company working with geographical data might choose to organize by location.

It’s a good idea to take time early on in a project to consider what the best organization methods will be for you and your team to stick to. Here’s another way to think about it. Unorganized data is like a messy room. It’s overwhelming, hard to find anything in, and gets worse the longer you avoid cleaning it up. But by making sure early on you know where to put your files, you can keep your work data organized, easy to use, and error free. Now that you see how important it is to keep data organized for both personal and work use, we’ll take a closer look at file naming conventions and how they carry over into your databases. See you in the next video.

Reading: Organization guidelines

Video: All about file naming

  • File naming conventions are important for organizing, accessing, processing, and analyzing data.
  • When creating file naming conventions, be sure to:
    • Work out your conventions early and align them with your team.
    • Use meaningful names that reference the project name, creation date, revision version, or other useful information.
    • Keep your file names short and sweet.
    • Format dates using the international date standard (year, month, day).
    • Lead revision numbers with a zero.
    • Use hyphens, underscores, or capitalized letters instead of spaces and special characters.
    • Create a text file that lays out all of your naming conventions for each project.

By following these tips, you can create file naming conventions that are both logical and functional, which will save you time and energy in the long run.
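The tips above can be combined into a small helper. This is a sketch, not a standard: the function name `build_file_name`, the `Project_YYYYMMDD_vNN.ext` pattern, and the sample project name are assumptions chosen for illustration. A team would agree on its own pattern and record it in the project's naming-conventions text file.

```python
import re
from datetime import date


def build_file_name(project: str, created: date, version: int, ext: str) -> str:
    """Return a name such as 'SalesReport_20201123_v02.csv'."""
    # Drop spaces and special characters; keep letters, digits, hyphens, underscores.
    clean_project = re.sub(r"[^A-Za-z0-9_-]", "", project)
    # Year-month-day order sorts chronologically even when sorted as plain text.
    stamp = created.strftime("%Y%m%d")
    # A leading zero keeps v02 ahead of v10 when sorted as plain text.
    return f"{clean_project}_{stamp}_v{version:02d}.{ext}"


name = build_file_name("Sales Report", date(2020, 11, 23), 2, "csv")
print(name)  # SalesReport_20201123_v02.csv

# Text sorting now matches revision order.
versions = [build_file_name("SalesReport", date(2020, 11, 23), v, "csv") for v in (10, 2, 1)]
print(sorted(versions))
```

Without the leading zero, a plain text sort would place `v10` before `v2`, which is the problem the revision-number tip is meant to avoid.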

Hey again. So you’ve heard me mention the idea of using meaningful and logical file names to help organize your data. But using consistent file names can also streamline or even automate your analysis process, saving you time and energy in the long run. When you use consistent guidelines that describe the content, date, or version of a file in its name, you’re using file naming conventions. As we’ve already discovered, these file naming conventions help us organize, access, process, and analyze our data.

So here are some general tips on creating file naming conventions that are both logical and functional. Here’s some quick file naming Do’s. Work out your conventions early to avoid having to spend time redoing them later. Align your file naming with your team and make sure your file names are meaningful with references to the project name, creation date, revision version, or any other useful information needed to understand what’s in that file.

Now, there’s some other simple things you can do to make sure your file naming conventions are on point. First of all, you want to keep your file names short and sweet. They’re supposed to be quick reference points that tell you what’s in a file. From earlier videos, we know that we want to include dates and revision numbers in our file names. I recommend formatting dates by year, month, and day because that follows the international date standard. Different countries have different date conventions, so keep that in mind. When you include revision numbers in a file name, lead with a zero, so that if you run into double digits of revisions, it’s already built into your conventions.

Another good rule is to use hyphens, underscores, or capitalized letters instead of using spaces. Spaces and special characters might not be recognized by your software. Plus avoiding spaces definitely makes it easier to work in SQL. My last bit of advice: create a text file that lays out all your naming conventions on a project. This is really helpful if someone new joins your team, or if you just need a quick reminder while you’re working on something. We talked about this earlier when we covered metadata, which is data about data. It helps explain what data there is and how it’s being organized.

When you use consistent, meaningful file naming conventions throughout your project, your data will be easy to find and use, and you can save yourself time, too. Up next, we’ll keep looking at spreadsheets and we’ll talk about security features and how you can use them to protect your data now that it’s organized. See you there.

Reading: Learning Log: Review file structure and naming conventions

Practice Quiz: Test your knowledge on how to organize data

Data analysts use guidelines to describe a file’s version, content, and date created. What are these guidelines called?

Data analysts use foldering to achieve what goals? Select all that apply.

Fill in the blank: To separate current from past work and reduce clutter, data analysts create _____. This involves moving files from completed projects to a separate location.

What is the process of structuring folders broadly at the top, then breaking down those folders into more specific topics?

Successful file naming conventions include information that’s useful when trying to locate or update a file. Which of the following is an effective file name?

Securing data


Video: Security features in spreadsheets

This video covers the security features built into spreadsheets, such as sheet protection and access control. Excel and Google Sheets offer similar features: both let you lock worksheets or ranges against editing and control who can view or edit a file. In Excel, files and worksheets can be encrypted with passwords; in Google Sheets, access is managed through the sharing menu.

Key takeaways:

  • Data security is important for protecting data from unauthorized access or corruption.
  • Spreadsheets come with built-in security features, such as sheet protection and access control.
  • Excel and Google Sheets have similar security features, but they work slightly differently because Excel spreadsheets are typically shared as files (for example, by email), while Google Sheets are stored and shared online.
  • As a data analyst, data security should be a priority.
  • There are some other basic best practices you can take to keep your data more secure overall, such as using file naming conventions and backing up your data regularly.

Call to action:

Make sure your data is prepared by organizing and securing it before you move on to the next step in the data analysis lifecycle.

Overview:

Spreadsheets are a powerful tool for storing and analyzing data, but they can also be a target for unauthorized access and corruption. It is important to take steps to protect your spreadsheet data by using the security features that are built into most spreadsheet programs.

Common security features in spreadsheets:

  • Sheet protection: This feature allows you to protect a worksheet or parts of a worksheet from being edited. This can be useful for preventing accidental changes to important data, or for preventing unauthorized users from editing the worksheet.
  • Password protection: This feature allows you to protect a spreadsheet with a password. This prevents users from opening or editing the spreadsheet unless they have the password.
  • User permissions: This feature allows you to control who can view and edit your spreadsheet. You can grant different levels of permission to different users, such as read-only access or full editing access.
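Password protection is only as strong as the password itself. As a hedged illustration, the sketch below generates a random password with Python's standard `secrets` module. The 16-character default, the 12-character minimum, and the symbol set are assumptions made for this example, not requirements of Excel or Google Sheets.

```python
import secrets
import string

# Illustrative character set; adjust to whatever your organization's policy allows.
ALPHABET = string.ascii_letters + string.digits + "!@#$%^&*-_"


def generate_password(length: int = 16) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    if length < 12:
        # Assumed minimum for this sketch only.
        raise ValueError("use at least 12 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


password = generate_password()
print(len(password))
```

A generated password still has to be stored and shared safely, for example in a password manager rather than in the same email as the file.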

How to use security features in spreadsheets:

Microsoft Excel:

To protect a worksheet in Excel:

  1. Go to the Review tab.
  2. In the Protect group (named Changes in some older versions), click Protect Sheet.
  3. In the Protect Sheet dialog box, select the actions that users should still be allowed to perform, and optionally enter a password.
  4. Click OK.

To password protect a spreadsheet in Excel:

  1. Go to the File tab.
  2. Click Info.
  3. Click Protect Workbook, then click Encrypt with Password.
  4. In the Encrypt Document dialog box, type a password and click OK.
  5. Retype the password in the Confirm Password dialog box.
  6. Click OK.

To set user permissions for a spreadsheet in Excel (labels vary by version; these steps assume a workbook stored in OneDrive or SharePoint):

  1. Click Share in the upper-right corner of the window, or go to the File tab and click Share.
  2. Enter the names or email addresses of the people you want to share with.
  3. Choose whether each person can edit or can only view the workbook.
  4. Click Send.

Google Sheets:

To protect a worksheet in Google Sheets:

  1. Open the spreadsheet that you want to protect.
  2. Click the Data menu.
  3. Click Protect sheets and ranges.
  4. In the side panel, click Add a sheet or range, then choose the range or sheet that you want to protect.
  5. Click Set permissions.
  6. Choose who is allowed to edit the protected range or sheet.
  7. Click Done.

To control access to a spreadsheet in Google Sheets (Google Sheets has no built-in option to set a password on a file; access is managed through its sharing settings):

  1. Open the spreadsheet that you want to protect.
  2. Click the File menu, then click Share.
  3. Add the people or groups who need access, and choose Viewer, Commenter, or Editor for each.
  4. Under General access, choose Restricted so that only the people you added can open the file.
  5. Click Done.

Best practices for securing spreadsheet data:

  • Use strong passwords for password-protected spreadsheets.
  • Grant user permissions only to users who need them.
  • Back up your spreadsheets regularly.
  • Be careful about sharing spreadsheets with others.
  • Be aware of the risks associated with opening spreadsheets from unknown sources.

By following these best practices, you can help to protect your spreadsheet data from unauthorized access and corruption.
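The advice to back up spreadsheets regularly can be sketched as a small script. This is one possible approach under stated assumptions: the function names, the `backups` folder, and the date-stamped naming pattern are hypothetical, and the checksum comparison is an extra safeguard added here rather than something the video describes.

```python
import hashlib
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def back_up(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir with a date-stamped name and verify the copy."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d")
    target = backup_dir / f"{source.stem}_{stamp}_backup{source.suffix}"
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"backup of {source} does not match the original")
    return target


# Demo in a temporary directory with a made-up file.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "budget.csv"
    original.write_text("month,amount\n2020-01,1200\n")
    copy = back_up(original, Path(tmp) / "backups")
    backup_matches = sha256_of(copy) == sha256_of(original)
    backup_name = copy.name
    print(backup_name, backup_matches)
```

In practice the backup folder would live on a different drive or service than the original, since a copy on the same disk does not protect against losing that disk.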

You’re back. Okay, now that our data’s organized and easy to find, it’s time to start thinking about how to protect it. The good news is that spreadsheets come with security features already built in. In this video, we’ll look at different spreadsheet programs and how their security features, like sheet protections and access control, are similar.

When I say “security features,” you might be imagining ways to protect data from other people. But that’s just one kind of security. Security features can be designed to keep unauthorized users from viewing certain files, or just lock your worksheets so that you don’t accidentally break your formulas. This is called data security. Data security means protecting data from unauthorized access or corruption by adopting safety measures.

Whatever spreadsheet program you’re using will have similar security measures built in. As a data analyst, you’ll run into Google Sheets and Excel a lot. Let’s talk about what they have in common. First, both programs have features that let you protect your spreadsheets or parts of your spreadsheets from being edited, from the entire worksheet down to single cells in a table. If you’re collaborating with other users, you can easily lock down your formulas so that they aren’t accidentally broken.

Speaking of collaborating, Excel and Google Sheets both have access control features like password protection and user permissions. This gives you more control over who can do what to your spreadsheet. Because these programs are located in different places, these features are slightly different. For Excel spreadsheets, you can encrypt files and worksheets with passwords before emailing them to other users. In Google Sheets, these settings are found under the sharing menu, which allows you to control who can see or edit the sheet online. Google Sheets can also be copied so that users can work with that data without altering the original. Tabs can also be hidden and unhidden in Sheets and Excel, allowing you to change what data is being viewed. But remember, even hidden tabs can be unhidden by someone else, so be sure you’re okay with those tabs still being accessible.

As a data analyst, data security will be a priority. But no matter which program you use to create spreadsheets, there’s security features to help you keep your work safe and secure. There are some other basic best practices you can take to keep your data more secure overall, which we’ll cover later in a reading.

You’ve made it to the end of this module. Congrats. In these videos, we’ve covered strategies for organizing data for personal and work use, how to develop functional file naming conventions, and some security measures you can take advantage of in spreadsheets. Before you move on to the next step in the data analysis lifecycle, it’s important that you make sure your data is prepared, and that includes organizing and securing it.

As usual after this video, you’ll have your weekly challenge. I know you’ve got this. Then after the weekly challenge, there’s some optional material all about connecting to the online data community. As you start building your career in data analytics, it’ll be really valuable to connect with others, learn about new trends in the field and share your own work. I think you’ll get a lot out of those videos. That’ll help you develop a professional online presence and find ways to communicate with people in your field, which is key as networking becomes more and more online and remote work opportunities become the norm. But if you feel pretty confident about your online presence, you can move into the course challenge instead. Good luck on this weekly challenge, and I’ll see you soon!

Reading: Balancing security and analytics

The battle between security and data analytics

Practice Quiz: Self-Reflection: Protecting your resources

Practice Quiz: Test your knowledge on securing your data

Fill in the blank: Data security involves using _____ to protect data from unauthorized access or corruption.

When using data security measures, analysts can choose between protecting an entire spreadsheet or protecting certain cells within the spreadsheet.

What tools can data analysts use to control who can access or edit a spreadsheet? Select all that apply.

Weekly challenge 4


Reading: Glossary: Terms and definitions

Practice Quiz: *Weekly challenge 4*

What aspects of a file do file-naming conventions typically describe? Select all that apply.

A data analyst is working with a file from a customer satisfaction survey. The survey was sent to anyone who became a new customer between April and June, 2020. Which of the following are effective names for the file?

Data analysts use a process called encryption to organize folders into subfolders.

An analyst team is organizing their project files in a hierarchical fashion. They decide to implement best practices. How do they structure their folders?

A data analyst is working on a spreadsheet. The analyst decides to send out the spreadsheet with restrictions so that users cannot manipulate the data. What data practice does this describe?

What best practice for organizing data can an analyst use to structure their project files in a way that describes content, date, or version of a file in its name?

Your boss assigns you a new multi-phase project and you create a naming convention for all of your files. With this project lasting years and incorporating multiple analysts it’s crucial that you create data explaining how your naming conventions are structured. What is this data called?

A data analyst is collecting data for a local high school football team. What is an appropriate naming convention for their file?

Foldering may be used by data analysts to organize folders into what?

Data analysts use archiving to separate current from past work. What does this process involve?

Using encryption to protect data is an example of what?

A data analyst wants to share spreadsheet tab A with their team. They’re still working with tabs B and C, and they don’t want their team members to access them yet. Hiding tabs B and C will protect them from being accessed.