News dataset from edition
This dataset contains over 27,000 news articles sourced from CNN.com, including full content, metadata, and media fields. Each article carries its publish date, author information, a short description, and both raw and cleaned content, which makes the collection suitable for media research, sentiment analysis, topic modeling, and natural language processing (NLP) projects.
Last crawled in July 2021, this collection offers a historical snapshot of CNN’s reporting and editorial content.
Use Cases:
- News content analysis
- Fake news detection & bias tracking
- Topic classification and clustering
- Training AI/NLP models
- Historical news trend research
- Media monitoring tools
Update Frequency:
Archived. The dataset is no longer updated, which suits snapshot-based analysis.
Last crawled:
July 2021
Data points:
title, url, published_at, last_modified_at, author, short_description, header_image, raw_content, content, crawled_at, _id, source
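As a rough illustration of how the fields listed above might be checked after download, the Python sketch below reads article records and reports any that lack one of the listed fields. The listing does not state the export format, so JSON Lines (one JSON object per line) is an assumption here, and the helper names `read_articles`, `missing_fields`, and `report_missing` are hypothetical.

```python
import json
from typing import Iterable, Iterator

# Field names copied from the "Data points" list of this listing.
EXPECTED_FIELDS = (
    "title", "url", "published_at", "last_modified_at", "author",
    "short_description", "header_image", "raw_content", "content",
    "crawled_at", "_id", "source",
)


def missing_fields(record: dict) -> list[str]:
    """Return the expected field names that are absent from one record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def read_articles(lines: Iterable[str]) -> Iterator[dict]:
    """Parse JSON Lines input (an assumed format), skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def report_missing(path: str) -> None:
    """Print the URL and missing fields of every incomplete record in a file."""
    with open(path, encoding="utf-8") as handle:
        for article in read_articles(handle):
            gaps = missing_fields(article)
            if gaps:
                print(article.get("url", "<no url>"), "is missing", gaps)
```

Calling `report_missing` with the path of the downloaded file would print one line per incomplete record. If the export turns out to be CSV or a single JSON array, only `read_articles` would need to change.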
Data points count:
11
Total Downloads
5+
Total Views
714
Availability or Type:
Immediately
Delivery time:
Immediately