Build a small ETL pipeline that fetches JSON from a public API, transforms it with pandas, and writes a CSV – all orchestrated by Prefect.
The pieces:

- `@task` – wrap any function in retries & observability.
- `log_prints` – surface `print()` logs automatically.
- `fetch_page` task – downloads a single page with retries.
- `to_dataframe` task – normalises JSON to a pandas `DataFrame`.
- `save_csv` task – persists the `DataFrame` and logs a peek.
- `etl` flow – orchestrates the tasks sequentially for clarity.
- An `if __name__ == "__main__"` guard with some basic configuration kicks things off.
The `etl` flow wires the three tasks together (`fetch_page`, `to_dataframe`, `save_csv`). Each `fetch_page` call downloaded a page and, if it failed, would automatically retry. (The `log_prints=True` flag logs messages inside the flow body; prints inside tasks are displayed in the console.) From here it is easy to extend: swap `save_csv` for a database loader or S3 upload with one small change, or import the `etl` flow and run it with different parameters from another flow.