You’ll complete an end-of-course project by creating a pipeline process to deliver data to a target table and developing reports based on project needs. You’ll also ensure that the pipeline is performing correctly and that there are built-in defenses against data quality issues.
Learning Objectives
- Identify business needs to determine a design for your portfolio project’s data pipeline.
- Analyze how database systems are designed, how to build BI tools such as pipelines and ETL systems, and how to optimize them for maximum performance in order to determine the most effective data pipeline process.
- Develop a data pipeline to deliver necessary data to a target table.
Apply your skills to a workplace scenario
Video: Welcome to module 4
This video discusses how to use your end-of-course project in your job search. It recommends using the project to build a portfolio and resume that exhibit your skills, and to practice your interview skills. It also highlights the importance of experiential learning and how this project can help you discover how organizations use BI every day while practicing your new skills.
Hi, it’s great to be with you again! I’m Anita, a Senior Business Intelligence Analyst at Google, and I’m back to discuss the next part of your end-of-course project and how you can use it in your job search. Are you ready to start job searching in the BI field? The first step is to create a portfolio and resume that exhibit your skills. Your portfolio will be a collection of materials that you can use to showcase those skills and your approach to solving BI problems. One way you can build up your portfolio is by completing projects that demonstrate what you’ve learned, like this program’s end-of-course project! This end-of-course project is also a really valuable opportunity to develop your interview skills. As potential employers assess you as a candidate, they might ask for specific examples of how you have tackled challenges in the past. You can use your portfolio as a way to discuss real problems that you have solved. Additionally, some employers might ask you to complete another case study for an interview. Getting some practice creating your own case studies means that you’ll be that much more prepared for those interviews.

You have already explored experiential learning, or the idea of understanding through doing. This end-of-course project is also a great opportunity to really discover how organizations are using BI every day, practice your new skills, and really show off your BI knowledge. In order to complete the end-of-course project, you’ll be presented with more details about the business case. Then, you’ll use the key BI documents you completed to create a pipeline system that delivers data to reporting tables. Later on, you can use these tables to design dashboards to share insights with stakeholders. By the time you complete this project, you’ll have a finished case study you can add to your portfolio. You’ll also have documentation of the steps you took along the way, which you can use to explain your work to future hiring managers.

At this point, you have probably already completed one part of the end-of-course project. Now, you’re almost finished with the next course of this program, which means you know everything you need to tackle the next phase. Ready? Then let’s get started!
Video: Continue your end-of-course project
The next part of the end-of-course project will focus on using a database to create BI tools that automate key processes such as moving and transforming data before reading it into target tables for reporting. This will enable the team to focus on other aspects of their jobs and get updated insights from the data. This is a great opportunity to demonstrate to potential employers that you can develop and improve BI tools.
In this course, you’ve been learning about how database systems are designed, how to build BI tools such as pipelines and ETL systems, and how to optimize them to maximize performance. Now, it’s time for an exciting next step: putting all of this to work for your end-of-course project. In the previous course, you were introduced to the project scenario and important details about the case. Along the way, you developed Stakeholder Requirements, Project Requirements, and Strategy Documents. Now that you have these important elements of the project laid out, it’s time to start thinking about the data.

In this part of the course, you’ll begin working with a database in order to create BI tools that automate key processes such as moving and transforming data before reading it into your target tables for reporting. Coming up, you’re going to access the project data, explore the design pattern for your database system, observe your team’s work to determine their needs, and use required metrics and facts to build a pipeline that meets those needs. These processes will enable your team to focus their time on other aspects of their day-to-day jobs, and will move and transform your data for immediate use in reports and dashboards. These reports and dashboards will give your stakeholders updated insights and empower them to get their own answers from the data.

This part of the end-of-course project is a great opportunity to demonstrate to potential employers that you can do exactly that. And remember: developing these tools is an iterative process, so you can continue to build and improve them as you have new ideas or learn new things.
Reading: Explore Course 2 end-of-course project scenarios
Overview
When you approach a project using structured thinking, you will often find that there are specific steps you need to complete in a specific order. The end-of-course projects in the Google Business Intelligence certificate were designed with this in mind. The challenges presented in each course represent a single milestone within an entire project, based on the skills and concepts learned in that course.
The certificate program allows you to choose from different workplace scenarios to complete the end-of-course projects: the Cyclistic bike share company or Google Fiber. Each scenario offers you an opportunity to refine your skills and create artifacts to share on the job market in an online portfolio.
You will be practicing similar skills regardless of which scenario you choose, but you must complete at least one end-of-course project for each course to earn your Google Business Intelligence certificate. To have a cohesive experience, it is recommended that you choose the same scenario for each end-of-course project. For example, if you chose the Cyclistic scenario to complete in Course 1, we recommend completing this same scenario in Courses 2 and 3 as well. However, if you are interested in more than one workplace scenario or would like more of a challenge, you are welcome to do more than one end-of-course project. Completing multiple projects offers you additional practice and examples you can share with prospective employers.
Course 2 end-of-course project scenarios
Cyclistic bike-share
Background:
In this fictitious workplace scenario, the imaginary company Cyclistic has partnered with the city of New York to provide shared bikes. Currently, there are bike stations located throughout Manhattan and neighboring boroughs. Customers are able to rent bikes for easy travel among stations at these locations.
Scenario:
You are a newly hired BI professional at Cyclistic. The company’s Customer Growth Team is creating a business plan for next year. They want to understand how their customers are using their bikes; their top priority is identifying customer demand at different station locations. Previously, you gathered information from your meeting notes and completed important project planning documents. Now you are ready for the next part of your project!
Course 2 challenge:
- Use project planning documents to identify key metrics and dashboard requirements
- Observe stakeholders in action to better understand how they use data
- Gather and combine necessary data
- Design reporting tables that can be uploaded to Tableau to create the final dashboard
Note: The story, as well as all names, characters, and incidents portrayed, are fictitious. No identification with actual people (living or deceased) is intended or should be inferred. The data shared in this project has been created for pedagogical purposes.
Google Fiber
Background:
Google Fiber provides people and businesses with fiber optic internet. Currently, the customer service team working in their call centers answers calls from customers in their established service areas. In this fictional scenario, the team is interested in exploring trends in repeat calls to reduce the number of times customers have to call in order for an issue to be resolved.
Scenario:
You are currently interviewing for a BI position on the Google Fiber call center team. As part of the interview process, they ask you to develop a dashboard tool that allows them to explore trends in repeat calls. The team needs to understand how often customers call customer support after their first inquiry. This will help leadership understand how effectively the team can answer customer questions the first time. Previously, you gathered information from your meeting notes and completed important project planning documents. Now you’re ready for the next part of your project!
Course 2 challenge:
- Use project planning documents to identify key metrics and dashboard requirements
- Consider best tools to execute your project
- Gather and combine necessary data
- Design reporting tables that can be uploaded to Tableau to create the final dashboard
Key Takeaways
In Course 2, The Path to Insights: Data Models and Pipelines, you focused on understanding how data is stored, transformed, and delivered in a BI environment.
Course 2 skills:
- Combine and transform data
- Identify key metrics
- Create target tables
- Practice working with BI tools
Course 2 end-of-course project deliverables:
- The necessary target tables
Now that you have completed this step of your project and developed the target tables, you are ready to work on your final dashboard in the next course!
Cyclistic scenario
Reading: Course 2 workplace scenario overview: Cyclistic
Previously, you started working with a fictional bike-share company, Cyclistic, to provide their team with key business intelligence insights. At the end of the last course, you consulted with stakeholders to develop project planning documents that establish their needs and expectations. The strategy and planning documents are key to helping you understand important details about this project, so you’ll want to make sure you have completed this step before moving on. Refer to the exemplar documents here: Activity Exemplar: Complete the business intelligence project documents for Cyclistic.
Use these exemplars to ensure that your own project planning documents are completed with enough detail to move on to the next part of the project.
Coming up, you are going to build on previous work to combine data from the tables you received for this project into one reporting table you will use to develop a dashboard you can share with stakeholders. The activities will guide you through uploading the data into your own project space, using SQL code in Dataflow or BigQuery, observing how stakeholders interact with data, and finalizing a reporting table to be used for the dashboard. There will be an exemplar project for you to review so you can check your work and continue to iterate on your own project.
This next step in your project will provide more opportunities to create additional workplace samples you can share with potential employers to demonstrate your BI skills and knowledge. This is also a great opportunity to practice the skills you have been learning so far.
Key takeaways
The end-of-course project is designed for you to practice and apply course skills in a fictional workplace scenario. By completing each course’s end-of-course project, you will have work examples that will enhance your portfolio and showcase your skills for future employers.
Reading: Cyclistic datasets
Reading: Observe the Cyclistic team in action
Understanding your stakeholders and how they use the data is key to developing business intelligence solutions that will address their specific needs. In addition to meeting with stakeholders to discuss their project requirements, it can also be useful to observe the team at work and identify any patterns or frequently asked questions.
In this reading, you’ll discover some of the ways that the Cyclistic team uses the data. This can inform the BI solution you develop for them and help you design the final reporting dashboard to be specific to their needs and useful to the whole team. As a BI professional, recognizing a team’s needs and customizing their systems can support their work as they make key business decisions.
The team at work
As you learned during your previous meeting with Cyclistic, the product development team has begun planning for the next year of Cyclistic’s bike-sharing program. Cyclistic’s Customer Growth Team is creating a business plan for next year. The team wants to understand how their customers are using their bikes; their top priority is identifying customer demand at different station locations. The Cyclistic team posed an important primary question:
- How can we apply customer usage insights to inform new station growth?
Answering this question starts with the data from the Cyclistic bikes themselves, which the team has provided you, and the reporting dashboard the team uses to gain insights. In addition to the explicit requests the stakeholders made, you realize a few key things about the team’s current processes.
First, you realize that there are stakeholders from a variety of different departments accessing and using this data with different levels of technical expertise. There are stakeholders from these teams:
- Product development
- Customer data
- Engineering
- Data analytics
- Data warehousing
- API
- IT
- Cyclistic executive
- Project management
For example, you realize that Earnest Cox, the VP of product development, often requests high-level insights into the data and rarely needs detailed overviews of it. In contrast, Tessa Blackwell from the data analytics team explores the data in depth and spends much more time reviewing the dashboard views. As you develop your reporting tools, you will want to find a way to meet both of these stakeholders’ needs.
Additionally, one of your coworkers finds out you’re working on this project and shares a dataset they created recently for a project of their own that they think might help you: NYC zip codes. This dataset provides the zip codes for the different neighborhoods and boroughs in New York City; this will let you compare the bike data to the weather data more easily since you will be able to match the locations more accurately. It will also help you develop your map visualization later on.
Key takeaways
As you prepare to create the pipeline system that will deliver data to your reporting tables and eventually your dashboard, recognizing the different kinds of users, their specific needs and questions, and how they are currently using the data can help guide your development process. In this case, observing the Cyclistic team in action revealed that there are users who need different levels of detail and have different technical abilities. Additionally, you gained a useful tool from a colleague that will help you explore multiple datasets for this project. You can use these discoveries to design a BI solution that really addresses this organization’s unique needs, and demonstrate your skill and flexibility to future employers!
Practice Quiz: Activity: Create your target table for Cyclistic
Activity Overview
In this activity, you will use your knowledge of SQL and potentially Google Dataflow to combine and move the key datasets you identified for the Cyclistic project into a target table. This represents the extraction phase of an ETL pipeline, when data is pulled from different sources and moved to its destination. You will use the table you create in this activity to develop the final dashboard for stakeholders. As you complete this activity, remember to refer to the previous work you did when completing the business intelligence project documents for Cyclistic for details about the Cyclistic project, as well as the activity to create a target table in BigQuery for a refresher on target tables.
Be sure to complete this activity before moving on. The next course item will provide you with a completed exemplar to compare to your own work. You will not be able to access the exemplar until you have completed this activity.
Scenario
Review the following scenario. Then, complete the activity. As a reminder, the end-of-course project activities are more open to your personal interpretation than other activities in this program. This is to give you an opportunity to practice the skills you have been learning in your own way. If you need help or feel stuck, you can always discuss your work with other learners in the discussion forums or review the exemplar to help guide your process.
The product development team at Cyclistic has begun developing their business plan for next year. In order to build a better Cyclistic, the team needs to understand how customers are currently using the bikes, how location and other factors impact demand, and what stations get the most traffic. The Cyclistic team has a few goals:
- Understand current customers’ needs, what makes a successful product, and how new stations might alleviate demand in different geographical areas
- Understand current usage of bikes at different locations
- Apply customer usage insights to inform new station growth
- Understand how different users (subscribers and non-subscribers) use the bikes
You met with stakeholders to complete project planning documents and uploaded the necessary tables into your BigQuery project space.
Instructions
Follow the instructions and answer the following question to complete the activity. Then, go to the next course item to compare your work to a completed exemplar.
Step 1: Log into your GCP tool
To begin this activity, log into your Google Cloud account and navigate to the BigQuery console. You can complete this activity using the BigQuery Sandbox, which does not require a Google Cloud billing account. You can learn more about enabling the Sandbox from the BigQuery help guide. You can also use Dataflow to execute SQL code as a Job by navigating to the Dataflow console instead; this will require you to have a Google Cloud account. Both tools are useful for this project, so choose whichever you are more interested in working with.
Step 2: Query your data
For this step, keep in mind the key metrics you and your stakeholders have identified, their business questions, and what data you’ll need to develop the final dashboard. Previously, you explored the different public datasets your stakeholders provided and uploaded the zip code table your colleague shared with you. For the final dashboard, you will need to create two target tables: a table to capture the entire year and a table that focuses on summer trends. Here is an example of a query to capture a table with data from the entire year:
SELECT
TRI.usertype,
ZIPSTART.zip_code AS zip_code_start,
ZIPSTARTNAME.borough AS borough_start,
ZIPSTARTNAME.neighborhood AS neighborhood_start,
ZIPEND.zip_code AS zip_code_end,
ZIPENDNAME.borough AS borough_end,
ZIPENDNAME.neighborhood AS neighborhood_end,
-- Since this is a fictional dashboard, you can add 5 years to make it look recent
DATE_ADD(DATE(TRI.starttime), INTERVAL 5 YEAR) AS start_day,
DATE_ADD(DATE(TRI.stoptime), INTERVAL 5 YEAR) AS stop_day,
WEA.temp AS day_mean_temperature, -- Mean temp
WEA.wdsp AS day_mean_wind_speed, -- Mean wind speed
WEA.prcp AS day_total_precipitation, -- Total precipitation
-- Group trips into 10 minute intervals to reduce the number of rows
ROUND(CAST(TRI.tripduration / 60 AS INT64), -1) AS trip_minutes,
COUNT(TRI.bikeid) AS trip_count
FROM
`bigquery-public-data.new_york_citibike.citibike_trips` AS TRI
INNER JOIN
`bigquery-public-data.geo_us_boundaries.zip_codes` ZIPSTART
ON ST_WITHIN(
ST_GEOGPOINT(TRI.start_station_longitude, TRI.start_station_latitude),
ZIPSTART.zip_code_geom)
INNER JOIN
`bigquery-public-data.geo_us_boundaries.zip_codes` ZIPEND
ON ST_WITHIN(
ST_GEOGPOINT(TRI.end_station_longitude, TRI.end_station_latitude),
ZIPEND.zip_code_geom)
INNER JOIN
`bigquery-public-data.noaa_gsod.gsod20*` AS WEA
ON PARSE_DATE("%Y%m%d", CONCAT(WEA.year, WEA.mo, WEA.da)) = DATE(TRI.starttime)
INNER JOIN
-- Note! Add your zip code table name, enclosed in backticks: `example_table`
`(insert your table name) zipcodes` AS ZIPSTARTNAME
ON ZIPSTART.zip_code = CAST(ZIPSTARTNAME.zip AS STRING)
INNER JOIN
-- Note! Add your zipcode table name, enclosed in backticks: `example_table`
`(insert your table name) zipcodes` AS ZIPENDNAME
ON ZIPEND.zip_code = CAST(ZIPENDNAME.zip AS STRING)
WHERE
-- This takes the weather data from one weather station
WEA.wban = '94728' -- NEW YORK CENTRAL PARK
-- Use data from 2014 and 2015
AND EXTRACT(YEAR FROM DATE(TRI.starttime)) BETWEEN 2014 AND 2015
GROUP BY
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13
Note that this query includes a DATE_ADD function to add five years to the data. The public data you are using to create this dashboard is from 2014 and 2015, so this is a way to make your dashboard appear more recent. Normally, you would not change the dates in a dataset, but because this is a fictional project, you can include this in your own query. This part of the query is optional; however, if you leave it out, your results will differ from the exemplar. You will need to develop a similar query to capture a table with data from July through September to explore summer trends specifically.
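If you want to sanity-check the query’s bucketing and date-shift expressions before running it at scale, you can mimic them in a few lines of Python. This is an optional illustrative sketch, not part of the activity; the function names are our own, and it assumes BigQuery’s documented behavior of rounding halfway cases away from zero for CAST and ROUND, and clamping out-of-range days for DATE_ADD.

```python
import math
from datetime import date

def round_half_away(x: float) -> int:
    """Round to the nearest integer, halfway cases away from zero
    (matching BigQuery's CAST(float AS INT64) and ROUND)."""
    return int(math.floor(x + 0.5)) if x >= 0 else int(math.ceil(x - 0.5))

def trip_minutes_bucket(tripduration_seconds: float) -> int:
    """Mimic ROUND(CAST(TRI.tripduration / 60 AS INT64), -1):
    convert seconds to minutes, then round to the nearest 10-minute bucket."""
    nearest_minute = round_half_away(tripduration_seconds / 60)
    return round_half_away(nearest_minute / 10) * 10

def shift_five_years(d: date) -> date:
    """Mimic DATE_ADD(DATE(starttime), INTERVAL 5 YEAR);
    Feb 29 in a source leap year clamps to Feb 28."""
    try:
        return d.replace(year=d.year + 5)
    except ValueError:
        return d.replace(year=d.year + 5, day=28)

# A 760-second trip (~12.7 minutes) lands in the 10-minute bucket,
# and a 2014 trip date is reported as 2019.
print(trip_minutes_bucket(760), shift_five_years(date(2014, 7, 1)))
```

Checking a couple of trip durations this way makes it easier to explain to stakeholders why `trip_minutes` values only ever appear in steps of ten.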
Step 3: Finish the job
Once you execute the code, it will take a few moments to process. After the query has finished running, you will be able to download the tables as CSV files by using the Save Results dropdown and selecting the appropriate file type.
This might take a few minutes. Once you have downloaded the table, you will be ready to upload it to Tableau to create your dashboard!
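Before uploading to Tableau, it can be worth a quick local check that the download looks complete; an unexpectedly low row count or missing columns usually means the export was truncated. Here is a minimal optional sketch using Python’s standard csv module; the filename in the usage comment is a placeholder.

```python
import csv

def summarize_csv(path: str) -> dict:
    """Report the column header and data-row count of a downloaded results file,
    so you can confirm the export looks complete before uploading it to Tableau."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)           # first line is the column header
        row_count = sum(1 for _ in reader)  # count remaining data rows
    return {"columns": header, "rows": row_count}

# Example usage (placeholder filename):
# summarize_csv("cyclistic_trips_year.csv")
```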
What to Include in Your Response
Be sure to address the following criteria:
- Necessary tables are successfully combined into summary tables
- Appropriate tables are downloaded and ready to upload to Tableau
Reading: Activity Exemplar: Create your target table for Cyclistic
In this activity, you created target tables to consolidate and store the data you pulled from the Cyclistic datasets. These tables will allow you to develop a dashboard using Tableau in the upcoming end-of-course project activities in the next course. As a BI professional, you will need to be able to use programs such as BigQuery and Dataflow to move and analyze data with SQL. This end-of-course project showcases your ability to do just that.
The exemplar you are about to review will help you evaluate whether you completed the activity correctly. In this case, you might have discovered a solution that works just as well as the exemplar. That’s great! This exemplar is an example of how a BI professional might have approached this challenge. As long as your process achieved the same results, you can move on to the next phase of the project.
If you find that the result you received is different from the exemplar provided, use the exemplar to iterate and adjust your own code.
Exploring the exemplar code
For this activity, you could run the following SQL query to create a summary table for the entire year:
SELECT
TRI.usertype,
ZIPSTART.zip_code AS zip_code_start,
ZIPSTARTNAME.borough AS borough_start,
ZIPSTARTNAME.neighborhood AS neighborhood_start,
ZIPEND.zip_code AS zip_code_end,
ZIPENDNAME.borough AS borough_end,
ZIPENDNAME.neighborhood AS neighborhood_end,
DATE_ADD(DATE(TRI.starttime), INTERVAL 5 YEAR) AS start_day,
DATE_ADD(DATE(TRI.stoptime), INTERVAL 5 YEAR) AS stop_day,
WEA.temp AS day_mean_temperature, -- Mean temp
WEA.wdsp AS day_mean_wind_speed, -- Mean wind speed
WEA.prcp AS day_total_precipitation, -- Total precipitation
-- Group trips into 10 minute intervals to reduce the number of rows
ROUND(CAST(TRI.tripduration / 60 AS INT64), -1) AS trip_minutes,
COUNT(TRI.bikeid) AS trip_count
FROM
`bigquery-public-data.new_york_citibike.citibike_trips` AS TRI
INNER JOIN
`bigquery-public-data.geo_us_boundaries.zip_codes` ZIPSTART
ON ST_WITHIN(
ST_GEOGPOINT(TRI.start_station_longitude, TRI.start_station_latitude),
ZIPSTART.zip_code_geom)
INNER JOIN
`bigquery-public-data.geo_us_boundaries.zip_codes` ZIPEND
ON ST_WITHIN(
ST_GEOGPOINT(TRI.end_station_longitude, TRI.end_station_latitude),
ZIPEND.zip_code_geom)
INNER JOIN
`bigquery-public-data.noaa_gsod.gsod20*` AS WEA
ON PARSE_DATE("%Y%m%d", CONCAT(WEA.year, WEA.mo, WEA.da)) = DATE(TRI.starttime)
INNER JOIN
-- Note! Add your zip code table name, enclosed in backticks: `example_table`
`(insert your table name) zipcodes` AS ZIPSTARTNAME
ON ZIPSTART.zip_code = CAST(ZIPSTARTNAME.zip AS STRING)
INNER JOIN
-- Note! Add your zipcode table name, enclosed in backticks: `example_table`
`(insert your table name) zipcodes` AS ZIPENDNAME
ON ZIPEND.zip_code = CAST(ZIPENDNAME.zip AS STRING)
WHERE
-- This takes the weather data from one weather station
WEA.wban = '94728' -- NEW YORK CENTRAL PARK
-- Use data from 2014 and 2015
AND EXTRACT(YEAR FROM DATE(TRI.starttime)) BETWEEN 2014 AND 2015
GROUP BY
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13
The result of this query is a merged target table that JOINs the public datasets and the zip code table you uploaded.
Additionally, you needed to execute a query that captured data from just the summer season:
SELECT
TRI.usertype,
TRI.start_station_longitude,
TRI.start_station_latitude,
TRI.end_station_longitude,
TRI.end_station_latitude,
ZIPSTART.zip_code AS zip_code_start,
ZIPSTARTNAME.borough AS borough_start,
ZIPSTARTNAME.neighborhood AS neighborhood_start,
ZIPEND.zip_code AS zip_code_end,
ZIPENDNAME.borough AS borough_end,
ZIPENDNAME.neighborhood AS neighborhood_end,
-- Since we're using trips from 2014 and 2015, we will add 5 years to make it look recent
DATE_ADD(DATE(TRI.starttime), INTERVAL 5 YEAR) AS start_day,
DATE_ADD(DATE(TRI.stoptime), INTERVAL 5 YEAR) AS stop_day,
WEA.temp AS day_mean_temperature, -- Mean temp
WEA.wdsp AS day_mean_wind_speed, -- Mean wind speed
WEA.prcp AS day_total_precipitation, -- Total precipitation
-- We will group trips into 10 minute intervals, which also reduces the number of rows
ROUND(CAST(TRI.tripduration / 60 AS INT64), -1) AS trip_minutes,
TRI.bikeid
FROM
`bigquery-public-data.new_york_citibike.citibike_trips` AS TRI
INNER JOIN
`bigquery-public-data.geo_us_boundaries.zip_codes` ZIPSTART
ON ST_WITHIN(
ST_GEOGPOINT(TRI.start_station_longitude, TRI.start_station_latitude),
ZIPSTART.zip_code_geom)
INNER JOIN
`bigquery-public-data.geo_us_boundaries.zip_codes` ZIPEND
ON ST_WITHIN(
ST_GEOGPOINT(TRI.end_station_longitude, TRI.end_station_latitude),
ZIPEND.zip_code_geom)
INNER JOIN
`bigquery-public-data.noaa_gsod.gsod20*` AS WEA
ON PARSE_DATE("%Y%m%d", CONCAT(WEA.year, WEA.mo, WEA.da)) = DATE(TRI.starttime)
INNER JOIN
-- Note! Add your zipcode table name, enclosed in backticks: `example_table`
`legalbi.sandbox.zipcodes` AS ZIPSTARTNAME
ON ZIPSTART.zip_code = CAST(ZIPSTARTNAME.zip AS STRING)
INNER JOIN
-- Note! Add your zipcode table name below, enclosed in backticks: `example_table`
`legalbi.sandbox.zipcodes` AS ZIPENDNAME
ON ZIPEND.zip_code = CAST(ZIPENDNAME.zip AS STRING)
WHERE
-- Take the weather from one weather station
WEA.wban = '94728' -- NEW YORK CENTRAL PARK
-- Use data for three summer months
AND DATE(TRI.starttime) BETWEEN DATE('2015-07-01') AND DATE('2015-09-30')
This query produces a table similar to the previous query’s, except it focuses on trends from July through September.
Key takeaways
Storing data from multiple sources in target tables allows you to access and use consolidated data for reporting purposes. In the Course 3 end-of-course project, you will use the table you’ve created in this activity to design a dashboard and share insights with the Cyclistic product development team in order to help guide their process and make informed decisions.
Google Fiber scenario
Reading: Course 2 workplace scenario overview: Google Fiber
Previously, you interviewed with the customer call center team at Google Fiber, a provider of fiber optic internet. Your interviewers asked you to complete a mock project based on the work performed by their business intelligence team. At the end of the last course, you consulted with stakeholders to develop project planning documents that establish their needs and expectations. The strategy and planning documents are key to helping you understand important details about this project, so you’ll want to make sure you have completed this step before moving on. Refer to the exemplar documents here: Activity Exemplar: Complete the business intelligence project documents for Google Fiber. Use these exemplars to ensure that your own documents are completed with enough detail to move on to the next part of the project.
Coming up, you are going to build on previous work to combine data from the tables into one reporting table you will use to develop a dashboard for stakeholders. The activities will guide you through uploading the data into your own project space, using SQL code in Dataflow or BigQuery, observing how stakeholders interact with data, and finalizing a reporting table to be used for the dashboard. There will be an exemplar project for you to review so you can check your work and continue to iterate on your own project.
This next step in your project will provide more opportunities to create additional workplace samples you can share with potential employers to demonstrate your BI skills and knowledge. This is also a great opportunity to practice the skills you have been learning so far.
Key takeaways
The end-of-course project is designed for you to practice and apply course skills in a fictional workplace scenario. By completing each course’s end-of-course project, you will have work examples that will enhance your portfolio and showcase your skills for future employers.
Reading: Google Fiber datasets
Reading: [Optional] Merge Google Fiber datasets in Tableau
Practice Quiz: Activity: Create your target table for Google Fiber
Reading: Activity Exemplar: Create your target table for Google Fiber
In this activity, you created a target table to consolidate and store the Google Fiber datasets. This table will allow you to develop a dashboard using Tableau in the upcoming end-of-course project activities in the next course. As a BI professional, you will need to be able to use programs such as BigQuery and Dataflow to move and analyze data with SQL. This end-of-course project showcases your ability to do just that.
The exemplar you are about to review will help you evaluate whether you completed the activity correctly. In this case, you might have discovered a solution that works just as well as the exemplar. That’s great! This exemplar is an example of how a BI professional might have approached this challenge. As long as your process achieved the same results, you can move on to the next phase of the project.
If you find that the result you received is different from the exemplar provided, use the exemplar to iterate and adjust your own code.
Exploring the exemplar code
For this activity, you could run the following SQL query to create a single combined table that merged all three of the datasets you were given:
SELECT
  date_created,
  contacts_n,
  contacts_n_1,
  contacts_n_2,
  contacts_n_3,
  contacts_n_4,
  contacts_n_5,
  contacts_n_6,
  contacts_n_7,
  new_type,
  new_market
FROM `your project.fiber.market_1`
UNION ALL
SELECT
  date_created,
  contacts_n,
  contacts_n_1,
  contacts_n_2,
  contacts_n_3,
  contacts_n_4,
  contacts_n_5,
  contacts_n_6,
  contacts_n_7,
  new_type,
  new_market
FROM `your project.fiber.market_2`
UNION ALL
SELECT
  date_created,
  contacts_n,
  contacts_n_1,
  contacts_n_2,
  contacts_n_3,
  contacts_n_4,
  contacts_n_5,
  contacts_n_6,
  contacts_n_7,
  new_type,
  new_market
FROM `your project.fiber.market_3`
The UNION ALL statement is applied here instead of a JOIN statement because the three tables share the same columns: UNION ALL stacks their rows vertically into a single table, keeping every row (including duplicates), whereas a JOIN would combine columns from tables matched on a key.
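If you want to experiment with this pattern before running it in BigQuery, you can sketch it locally with Python's built-in sqlite3 module. This is only an illustration, not part of the exemplar; the table and column names below are simplified stand-ins for the Google Fiber datasets:

```python
# Minimal local sketch of the UNION ALL pattern, using sqlite3 as a
# stand-in for BigQuery. Table and column names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Three "market" tables with identical schemas, like the
# market_1/market_2/market_3 datasets in the exemplar.
for market in ("market_1", "market_2", "market_3"):
    cur.execute(
        f"CREATE TABLE {market} "
        "(date_created TEXT, contacts_n INTEGER, new_market TEXT)"
    )

cur.execute("INSERT INTO market_1 VALUES ('2022-01-01', 10, 'market_1')")
cur.execute("INSERT INTO market_2 VALUES ('2022-01-01', 20, 'market_2')")
cur.execute("INSERT INTO market_3 VALUES ('2022-01-02', 30, 'market_3')")

# UNION ALL stacks rows vertically; every row from every table is kept.
rows = cur.execute(
    "SELECT date_created, contacts_n, new_market FROM market_1 "
    "UNION ALL "
    "SELECT date_created, contacts_n, new_market FROM market_2 "
    "UNION ALL "
    "SELECT date_created, contacts_n, new_market FROM market_3"
).fetchall()

print(len(rows))  # 3: one row from each market table
```

Because every SELECT lists the columns in the same order, the rows line up cleanly in the combined result, which is exactly why the matching schemas of the three market tables make UNION ALL the right tool here.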
After you have run this query, you should have a combined table like this:
[Image: Create a target table for Google Fiber exemplar]
Key takeaways
Storing data from multiple sources in target tables allows you to access and use consolidated data for reporting purposes. In the Course 3 end-of-course project, you will use the table you’ve created in this activity to design a dashboard and share insights with the Google Fiber stakeholders in order to help guide their process and make informed decisions.
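Since this project also asks you to build defenses against data-quality issues into your pipeline, one simple check worth adding after a merge like this is a row-count comparison: the target table should contain exactly as many rows as the source tables combined. A minimal sketch, again using Python's sqlite3 module as a local stand-in for BigQuery (table names are illustrative):

```python
# Row-count sanity check after a UNION ALL merge: the target table
# must hold exactly as many rows as all source tables combined.
# sqlite3 is used here as a local stand-in for BigQuery.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Illustrative source tables, two rows each.
sources = ("market_1", "market_2", "market_3")
for name in sources:
    cur.execute(f"CREATE TABLE {name} (date_created TEXT, contacts_n INTEGER)")
    cur.execute(f"INSERT INTO {name} VALUES ('2022-01-01', 1), ('2022-01-02', 2)")

# Build the target table with the same UNION ALL pattern as the exemplar.
cur.execute(
    "CREATE TABLE target AS "
    "SELECT * FROM market_1 UNION ALL "
    "SELECT * FROM market_2 UNION ALL "
    "SELECT * FROM market_3"
)

# The check: total source rows must equal target rows.
source_total = sum(
    cur.execute(f"SELECT COUNT(*) FROM {name}").fetchone()[0]
    for name in sources
)
target_total = cur.execute("SELECT COUNT(*) FROM target").fetchone()[0]
assert source_total == target_total  # 6 == 6
```

If the counts ever disagree, a row was dropped or duplicated somewhere in the merge, which is exactly the kind of silent data-quality problem a pipeline check should catch before stakeholders see a dashboard built on the table.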
End-of-course project wrap-up
Video: Tips for ongoing success with your end-of-course project
At the end-of-course project stage, you have learned a lot of transferable skills, such as understanding and observing how stakeholders can use data to meet their needs, understanding different types of databases and storage systems, and understanding the logic behind pipelines that ingest, transform, and deliver data. These skills are worth highlighting in job interviews, regardless of the tools required for the position.
When communicating your end-of-course project to interviewers, keep in mind that they may not be BI professionals and may not have the same detailed understanding of BI processes as you do. Tailor your presentation to be relevant to their needs and answer the key questions they have about you.
Coming up, you will learn how to present BI insights to stakeholders and provide them with access to relevant data. By the end of the courses, you will wrap up your end-of-course project by creating dashboards and presentations. This will give you a complete case study for your portfolio.
At this point in your
end-of-course project, you’ve done a lot of work already. You collaborated with stakeholders
to determine their needs and create important
project planning documents; then you used those to
explore the project data and build systems to deliver the data. As you continue to work on
your end-of-course project, you’ll want to consider how
you can highlight your process and explain what you’ve done to potential employers
and hiring managers. First, it’s important to recognize that, as a BI professional, you may be asked to learn
and adapt to new tools. There are a lot of great
solutions out there, and different businesses
use different BI tools depending on their needs. Keep in mind that you have learned a lot of transferable skills
that can be applied across different tools. You’ve also learned the
importance of understanding and observing how
stakeholders can use data in order to better meet their data needs; you recognize the different
types of databases and storage systems and
how they can be useful in a larger database system;
and you understand the logic behind pipelines that ingest,
transform, and deliver data. These are all skills worth
highlighting in job interviews, no matter what tools
the position requires. In addition, always be sure
to consider your audience. As you have been learning
throughout these courses, you will often work with
different kinds of stakeholders who have different levels
of technical know-how. When you communicate with them, keep in mind who they are
and what their goals are, and what they already know
and what they need to know. This is just as true
when you’re discussing your end-of-course
project with interviewers. Often, there will be people conducting or joining your interview who aren’t necessarily BI professionals. For example, hiring managers may not have the same detailed understanding
of BI processes as you do. In order to keep your
presentation relevant to them, try to remember those key
questions about your audience. Your interviewers have
a business challenge, just like stakeholders on a BI project: they have an open job position to fill. Think about what they need to know about you to make a decision that solves that challenge. Coming up, you’re going to learn all about presenting BI insights
to your stakeholders and providing them with
access to relevant data they can use to make decisions. By the end of these courses, you’ll wrap up your end-of-course project by creating dashboards and presentations. In no time, you’ll have
a complete case study for your portfolio.
Video: Luis: Tips for interview preparation
Before the interview
- Research the role, the company, and the team. This will help you to understand what the role entails, what the company’s culture is like, and what the team is working on.
- Prepare a document that outlines the role, the typical questions and interview types, the team and organization, and the projects that help the team grow and excel. This will help you to stay focused and organized during your interview preparation.
- Prepare your technical side. This includes reviewing your portfolio, understanding the impact of your work, and being able to explain why you applied certain solutions instead of others.
- Be prepared to discuss your next steps. What can be done on top of your current solution?
During the interview
- Focus on impact. Be sure to highlight the impact of your work in your answers. What were the outcomes? How did you measure them?
- Be prepared to sell yourself. Speak confidently about your skills and experience.
- Rehearse your interview with someone else. This will help you to practice your delivery and receive feedback.
After the interview
- Network with people in the field. Attend meetups and conferences to connect with other data professionals.
Luis also emphasizes that it is important to remember that you are not alone on this journey. There are many people who can help you increase your network and make connections.
My name is Luis, and I’m a data and analytics manager at Google. When I’m preparing for an interview,
I like to prepare a document where I describe what the role is, what the typical questions and types of interviews are for this company, what team or organization this role belongs to, and what projects help this team to grow and to excel. I also prepare the technical side: the portfolio, understanding the impact I want to showcase and how it aligns with the company’s strategy. And I really take the time, at least two weeks, to be ready for a first interview. A portfolio is a collection of your
past experience and the past projects that you think are useful to showcase
to someone that is interviewing you. It’s basically your brand, and you need to make sure that the projects showcase the value of your brand. The common mistake that I usually see is a lack of showcasing impact. On the technical side, ensure that you understand why you created a data solution. What was the outcome? What is the impact? How did you measure that? On top of that, make sure that you are capable of explaining why you applied a certain solution instead of others, and also make sure that you define and are able to explain the next steps: what can be done on top of this solution. When you are preparing for an interview, make sure that you rehearse the interview with someone else. It’s important to make sure that you tell the right message, that you focus on impact, and that you are able to sell yourself the best way you can. So you need to be really prepared for the interview, and rehearsing with someone that you trust and are comfortable receiving feedback from is really key to making you ready to answer the interview questions. If BI is really the path that
you want to pursue, this is a very good first step to take. But you should not stop here,
go out and talk with people. There are multiple meetups that
happen around data and BI, so don’t feel that you are alone on this road. There are multiple people that can
help you increase your network and make connections; you’ll be able to share knowledge with others and receive it from them as well.
Course review: The Path to Insights: Data Models and Pipelines
Video: Course wrap-up
This course covered database modeling, design patterns, data pipelines, data storage systems, BI tools, and stakeholder engagement. In the next course, you will learn how to design visualizations and dashboards for BI and present those insights.
Key takeaways:
- Database modeling and design patterns are essential for creating efficient and scalable database systems.
- Data pipelines and storage systems are used to get data where it needs to go and transform it to be useful.
- BI tools can be used to interact with stakeholders and present data in a way that is accessible and useful for decision-making.
- BI processes and systems are often iterative, meaning that they need to be constantly updated to meet changing needs.
Next steps:
Learn how to design visualizations and dashboards for BI and present those insights.
Congratulations on
completing another course. You’re that much
closer to finishing this program and receiving
your certificate. You’ve learned a lot already. Take a moment to
consider everything we’ve covered in this
course and celebrate. For example, you learned
about database modeling, design patterns, and how to use database schema to describe
those design patterns. You also discovered
many kinds of databases and how they have different uses within
a database system. Next, you explored data pipelines and how ETL and ELT processes help get data where
it needs to go and transform it to be useful
during that process. You also learned about
different data storage systems that you might use
in a pipeline. Additionally, you explored
more common BI tools and how to interact with
stakeholders effectively. You even had a chance to
create your own pipeline. Then by exploring database
and pipeline optimization, you were able to consider how BI processes and systems
are often iterative. Coming up, you have even more exciting
discoveries to make. Now that you understand
how to create systems to deliver
data to stakeholders, it’s time to start thinking
about how to present that data and make it accessible and useful
for decision-making. In the next course,
you’re going to learn more about how to design visualizations and
dashboards for BI and present those insights. Great work so far.
Reading: Course 2 glossary
Reading
Reading: Get started on Course 3
Reading
Coming up next…
Congratulations on completing another course in the Google Business Intelligence Certificate! In this part of the program, you learned more about data modeling and ETL processes for extracting data from source systems, and transforming it into formats that better enable analysis and drive business processes and goals.
The entire program has three courses:
- Foundations of Business Intelligence: In this course, you will discover the role of business intelligence professionals within an organization, as well as typical career paths. You will explore core BI practices and tools and how BI professionals use them to make an impact.
- The Path to Insights: Data Models and Pipelines: This is the course you’ve just completed.
- Decisions, Decisions: Dashboards and Reports: In this course, you will apply your knowledge of business intelligence and data modeling to create dynamic dashboards that track KPIs to meet stakeholder needs.
Now that you have completed this course, you are ready to move on! To continue with the program, you can go to the next course by clicking this link: Decisions, Decisions: Dashboards and Reports.
Keep up the great work!