Learning Tracks: Working With Data Teams
If you work with data teams as part of your day-to-day, you'll need a strong technical foundation. This learning track breaks down the concepts and tools you'll need to understand to be a great partner to all different types of data teams (and impress your boss).
The basics
Whether you're working with analytics, data science, or ML, there are some important basics that all data work starts with. Nail these down and you'll be ready to get into more role-specific stuff.
- What do data teams even do? Start by reading about the basic jobs to be done for data teams.
- At SaaS companies, product analytics is a big part of what data teams do.
- You can read an overview of different parts of the data stack here.
Where data comes from
To get powerful models and nice dashboards, the data needs to come from somewhere, and it's usually a mishmash of sources from around your business.
- Data for analytics comes from across your business: your user and app data, and third party tools like Stripe and Salesforce.
- Relational databases are the ABCs of backends: they're where you store the data your app needs, like your users and their settings (there's a small sketch of one after this list).
- NoSQL databases are another popular way to store data, with less structure and more flexibility.
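To make that concrete, here's a minimal sketch of a relational table an app backend might use. SQLite stands in for a production database like Postgres, and the users table and its columns are made up for illustration.

```python
import sqlite3

# SQLite stands in here for a production relational database like Postgres.
conn = sqlite3.connect(":memory:")

# A typical app table: one row per user, with typed, structured columns.
conn.execute("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE,
        plan TEXT DEFAULT 'free'
    )
""")
conn.execute(
    "INSERT INTO users (email, plan) VALUES (?, ?)",
    ("ada@example.com", "pro"),
)

# The app (and later, the data team) reads it back with plain SQL.
for row in conn.execute("SELECT id, email, plan FROM users"):
    print(row)  # (1, 'ada@example.com', 'pro')
```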
Where data is stored
Once data teams have their source data in order, they usually store it in a special database designed specifically for analytics and data science.
- A popular but less structured storage format is called a data lake.
- Snowflake is the most popular cloud data warehouse, and its 2020 debut was the biggest software IPO in history at the time (there's a sketch of a typical warehouse query after this list).
- Elastic is an analytics database specifically built for searching through unstructured data.
- MongoDB is a popular NoSQL document database for applications.
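Warehouses like Snowflake are queried with SQL, much like the relational databases above. As a rough sketch of the kind of question an analytics database answers, here's a revenue-per-customer query; SQLite stands in for the warehouse, and the orders table is hypothetical.

```python
import sqlite3

# SQLite stands in for a cloud warehouse like Snowflake; the SQL shape is similar.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL, placed_on TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("acme", 120.0, "2023-03-01"),
        ("acme", 80.0, "2023-03-09"),
        ("globex", 45.0, "2023-03-02"),
    ],
)

# A typical analytics question: how much revenue came from each customer?
query = """
    SELECT customer, SUM(amount) AS revenue
    FROM orders
    GROUP BY customer
    ORDER BY revenue DESC
"""
for customer, revenue in conn.execute(query):
    print(customer, revenue)  # acme 200.0, then globex 45.0
```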
How data gets moved around
Source data is rarely in the format data teams need, so they have to transform it into the right shape. This is sometimes done before moving it into the warehouse (ETL) and sometimes after (ELT); there's a tiny sketch of the transform step after the list below.
- Transforming data usually gets called ETL, short for extract, transform, and load.
- dbt is an increasingly popular tool for transforming and organizing your warehouse data.
- Kafka is a powerful tool built at LinkedIn for streaming event data in real time.
- Segment helps data teams collect analytics events and send them to the tools that need them.
- Databricks is a tool for running Spark jobs, basically ETL for big data.
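To show what "transform" actually means, here's a tiny, hypothetical ETL sketch in plain Python: extract raw records from a source, reshape them, and load them into a warehouse table. Real pipelines run on tools like dbt or Spark; the payment fields below are made up.

```python
import sqlite3

# Extract: raw records as they might arrive from a source like Stripe.
raw_payments = [
    {"id": "ch_1", "amount_cents": 1999, "status": "succeeded"},
    {"id": "ch_2", "amount_cents": 500,  "status": "failed"},
]

# Transform: keep only successful payments and convert cents to dollars.
clean = [
    (p["id"], p["amount_cents"] / 100)
    for p in raw_payments
    if p["status"] == "succeeded"
]

# Load: write the cleaned rows into a warehouse table (SQLite stands in).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE payments (id TEXT, amount_dollars REAL)")
warehouse.executemany("INSERT INTO payments VALUES (?, ?)", clean)
print(warehouse.execute("SELECT * FROM payments").fetchall())  # [('ch_1', 19.99)]
```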
How data gets used
Once clean, organized data is in the warehouse, you can do almost anything with it, from dashboards to operations to ML models (there's a small sketch after this list).
- A language-based ML model named GPT-3 took the world by storm.
- As anyone who has seen or used ChatGPT or DALL-E knows, ML and AI have been advancing quickly over the past few years.
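To make "using the data" concrete, here's a small sketch of the usual first step behind dashboards and models: pulling a warehouse table into a pandas DataFrame. SQLite stands in for the warehouse, and the signups table is hypothetical.

```python
import sqlite3
import pandas as pd

# SQLite stands in for the warehouse; signups is a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (week TEXT, count INTEGER)")
conn.executemany(
    "INSERT INTO signups VALUES (?, ?)",
    [("2023-W01", 40), ("2023-W02", 55), ("2023-W03", 72)],
)

# Pulling warehouse data into a DataFrame is the typical first step
# for dashboards, ad hoc analysis, and feeding ML models.
df = pd.read_sql_query("SELECT week, count FROM signups ORDER BY week", conn)
print(df)
print("Week-over-week growth:", df["count"].pct_change().round(2).tolist())
```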
Technically learning tracks help make the world of software simple and digestible, so you can be better at your job. There are more on the way!
Ideas for other learning tracks? Ways we can improve this one? Let us know.