What is a data pipeline?

A data pipeline, or simply a pipeline, is a series of data processing steps. Data is first ingested at the beginning of the pipeline. It then passes through a sequence of steps in which the output of one step becomes the input of the next, until the final step completes. The steps of a pipeline are often executed in parallel or in a time-sliced fashion.
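The chaining described above, where each step consumes the output of the previous one, can be sketched in a few lines of Python. This is only a minimal illustration: the step functions (`strip_whitespace`, `drop_empty`, `to_upper`) and the `run_pipeline` helper are hypothetical names invented for this example, not part of any particular pipeline framework.

```python
from functools import reduce

# Hypothetical steps: each takes a list of strings and returns a new list.
def strip_whitespace(records):
    return [r.strip() for r in records]

def drop_empty(records):
    return [r for r in records if r]

def to_upper(records):
    return [r.upper() for r in records]

def run_pipeline(data, steps):
    # Feed the output of each step into the next one, in order.
    return reduce(lambda current, step: step(current), steps, data)

raw = ["  alice ", "", "bob", "   "]
result = run_pipeline(raw, [strip_whitespace, drop_empty, to_upper])
print(result)  # ['ALICE', 'BOB']
```

Keeping every step as a function with the same input and output shape is what makes the steps easy to reorder, remove, or test in isolation.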


Data pipelines consist of three key elements: a source, one or more processing steps, and a destination. The source may be a database, an application, or a cloud service. The destination may be a data consumer, such as a machine learning or data visualization algorithm, or even another database.
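As a rough sketch of these three elements, the example below wires a source, a processing step, and a destination together using Python generators. Everything here is an assumption made for illustration: the in-memory CSV text stands in for a real source, a plain list stands in for a destination database, and the function names are hypothetical.

```python
import csv
import io

# Hypothetical in-memory CSV standing in for a real source system.
SOURCE_CSV = "user,amount\nann,10\nben,-3\nann,5\n"

def source(text):
    # Source: yield one record (a dict) per CSV row.
    yield from csv.DictReader(io.StringIO(text))

def process(records):
    # Processing: convert amounts to integers and skip negative ones.
    for record in records:
        amount = int(record["amount"])
        if amount >= 0:
            yield {"user": record["user"], "amount": amount}

def destination(records, sink):
    # Destination: append each record to a list standing in for a database.
    for record in records:
        sink.append(record)
    return sink

warehouse = destination(process(source(SOURCE_CSV)), [])
print(warehouse)
```

Because the source and processing stages are generators, each record travels through the whole chain before the next one is read, so the stages interleave instead of each running over the full dataset in turn.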

Data pipelines enable the flow of data between systems: for example, from an application to a data warehouse, from a data lake to an analytics database, or into a payment processing system.

Common processing steps in data pipelines include transformation, augmentation, enrichment, filtering, grouping, aggregation, and running algorithms against the data.
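Several of these steps can be shown on a small dataset. The sketch below applies filtering, transformation, enrichment, grouping, and aggregation in sequence; the order records, the `regions` lookup table, and all field names are invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical order records and lookup table, invented for illustration.
orders = [
    {"customer": "c1", "amount_cents": 1250, "status": "paid"},
    {"customer": "c2", "amount_cents": 800, "status": "cancelled"},
    {"customer": "c1", "amount_cents": 450, "status": "paid"},
    {"customer": "c3", "amount_cents": 2000, "status": "paid"},
]
regions = {"c1": "EU", "c2": "US", "c3": "US"}

# Filtering: keep only paid orders.
paid = [o for o in orders if o["status"] == "paid"]

# Transformation: convert cents to a decimal currency amount.
transformed = [{**o, "amount": o["amount_cents"] / 100} for o in paid]

# Enrichment: attach the customer's region from the lookup table.
enriched = [{**o, "region": regions[o["customer"]]} for o in transformed]

# Grouping and aggregation: total amount per region.
totals = defaultdict(float)
for o in enriched:
    totals[o["region"]] += o["amount"]

print(dict(totals))  # {'EU': 17.0, 'US': 20.0}
```

In a production pipeline these steps would typically run in a dataframe library or a distributed engine, but the sequence of operations has the same shape.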

Copyright ©2025 Educative, Inc. All rights reserved