4 Steps to Develop Your News Aggregation Platform
Under the Hood
The story begins with the constant bombardment of news and content.
Too many distractions were hurting my ability to stay focused, and FOMO kept pulling me back into consuming more content.
I have been reading and watching online content like a madman for the past two or three months, and there are reasons for it:
- To keep up with the news trending around the globe
- To research my next non-tech business idea
- To find my next online business idea
- To consume information and gain knowledge
- To learn coding, design, and sales/marketing

Yes, I have an agenda and a reason for using the internet and social media.
I read Twitter in my free time, use Instagram to watch cool designs, and so on.
We all do the same, except much of that time gets wasted on binge-watching rather than on gaining information or producing anything useful.
Why an aggregator?
We need to learn the art of using the internet wisely.
Too much information is not good: it creates confusion and leads to decision paralysis.
Think about the last time you searched the internet for information about your career or your next opportunity without ending up confused.
We can hardly think on our own anymore; we need at least one social media platform's assistance to make the final decision.
I don't want that happening to me in the long term, which is why, instead of juggling too many sources, I decided to first bring them all into one place.
4 Steps to Create a News Aggregation Platform
- Collect the platforms you want to onboard, along with their links and RSS feeds
- Create an endpoint to scrape the data via the RSS URL; usually, one API call will work fine
- Create a frontend interface to showcase all the data fetched from the RSS URLs of the onboarded platforms
- Integrate the endpoint API into the frontend interface and render all the platforms' latest feeds or news in one place

Yes, in these 4 steps, your aggregation platform, a Google News clone, is ready.
RSS feed Platforms
Collecting the RSS feeds of the news platforms you want to onboard is an easy task.
You can use ChatGPT for this and get an extensive list of platforms, each with its website, description, and RSS URL, in one table or in JSON format.
I did the same using the following prompt:
Give me 30 news websites in a table with the following columns:
RSS feed URL, description, website link and name
This will return a table, but you can ask for JSON format as well.
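For reference, a single entry in the JSON variant might look something like this; the exact field names depend on how you phrase the prompt:

{
  "name": "BBC News",
  "website": "https://www.bbc.com/news",
  "rssFeedUrl": "http://feeds.bbci.co.uk/news/rss.xml",
  "description": "International news coverage from the BBC"
}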
Data Fetching
This is again an easy task.
Using the axios and rss-parser npm modules, we can scrape the data straight from the RSS feed URL.
import axios from 'axios';
import RssParser from 'rss-parser';

export const scrapFromRSS = async (url) => {
  // fetch the raw RSS XML
  const res = await axios.get(url);
  const parser = new RssParser();
  // parseString turns the XML into a feed object with an `items` array
  const feed = await parser.parseString(res.data);
  return feed;
};
The items key contains the array of items, that is, the content from the RSS URL of the corresponding website whose data we are fetching.
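For reference, the object returned by rss-parser looks roughly like this; the exact fields vary from feed to feed:

{
  title: 'Example Feed',
  items: [
    {
      title: 'Story headline',
      link: 'https://example.com/story',
      pubDate: 'Mon, 01 Jan 2024 10:00:00 GMT',
      contentSnippet: 'A short plain-text summary of the story'
    }
  ]
}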
API endpoint
Once the method is ready, we need to create a dedicated endpoint for it. This endpoint will run the same method for multiple platforms and fetch all the data for us in one place.
import { Router } from 'express';
import { scrapFromRSS } from 'utils';

const router = Router();

export const fetchDataFromRSS = async (req, res) => {
  const platforms = ['', '', '']; // with RSS URLs
  // wait for all platforms in parallel; forEach with an async callback
  // would send the response before any fetch finished
  const results = await Promise.all(platforms.map((url) => scrapFromRSS(url)));
  const feeds = results.flatMap((feed) => feed.items);
  res.send(feeds);
};

router.get('/fetchDataFromRSS', fetchDataFromRSS);
export default router;
Our feeds API is ready, and it will return all the aggregated data.
Frontend Interface
Developing the front end is not hard.
My preferred stack is the following:
- Next.js as the framework
- React.js as the UI library
- Tailwind CSS for styling
- shadcn/ui, Mantine, Material UI, or Chakra UI for components
- Vercel for hosting
- GitHub for version control
- Framer Motion for animation
- Supabase or Firebase for the database
We can use Firebase or Supabase in the above endpoint to store the latest feeds in the database and avoid hitting the RSS sources on every request.
Redis is a good choice for adding a caching layer on top of the database.
A cron job can be used to schedule refetching of the latest data from the RSS feeds every hour, as in the sketch below.
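Here is a rough sketch of how the storage and the hourly refresh could fit together, assuming the node-cron package and the Supabase JS client; the feeds table and its link column are assumptions, and scrapFromRSS is the helper from earlier.

import cron from 'node-cron';
import { createClient } from '@supabase/supabase-js';
import { scrapFromRSS } from 'utils';

const supabase = createClient(process.env.SUPABASE_URL, process.env.SUPABASE_KEY);

// run at minute 0 of every hour
cron.schedule('0 * * * *', async () => {
  const platforms = ['', '', '']; // same RSS URL list as the endpoint
  const results = await Promise.all(platforms.map((url) => scrapFromRSS(url)));
  const items = results.flatMap((feed) => feed.items);
  // upsert so refetched stories don't create duplicates
  // (assumes a `feeds` table with `link` as a unique column)
  await supabase
    .from('feeds')
    .upsert(
      items.map(({ title, link, pubDate }) => ({ title, link, pubDate })),
      { onConflict: 'link' }
    );
});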
Frontend API integration
The only remaining part of the front end is integrating the API.
The axios npm module works well here too.
// call the endpoint we created above
const feeds = await axios.get('/fetchDataFromRSS');
return feeds.data;
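For example, in a Next.js pages-router setup, the call can live in getServerSideProps and the feeds can be rendered directly; the endpoint URL here is an assumption:

import axios from 'axios';

export async function getServerSideProps() {
  // fetch the aggregated feeds on every request
  const res = await axios.get('http://localhost:3000/fetchDataFromRSS');
  return { props: { feeds: res.data } };
}

export default function Home({ feeds }) {
  return (
    <ul>
      {feeds.map((item) => (
        <li key={item.link}>
          <a href={item.link}>{item.title}</a>
        </li>
      ))}
    </ul>
  );
}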
New Platform Addition API
We need another endpoint to keep adding new platforms to our platforms list.
The feeds endpoint iterates over this list to fetch and return the data to us, so validating the list is important, as is checking the RSS URL itself, because the route of a URL can change over time.
This platform-addition endpoint will simply check that the RSS feed URL is valid and actually works before pushing it into the database.
import { scrapFromRSS } from 'utils';

const addNewPlatform = async (url) => {
  const feed = await scrapFromRSS(url);
  // a working feed should parse and contain at least one item
  const isValid = Array.isArray(feed.items) && feed.items.length > 0;
  if (isValid) {
    // add this to the database collection of platforms
  } else {
    throw new Error('Invalid platform or RSS feed URL');
  }
};
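To expose this as an actual endpoint, a minimal Express route could look like the following; it assumes the same router as before and that the express.json() middleware is applied so req.body is parsed:

router.post('/addNewPlatform', async (req, res) => {
  try {
    await addNewPlatform(req.body.url);
    res.send({ success: true });
  } catch (err) {
    res.status(400).send({ error: err.message });
  }
});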
We have now automated the process of adding new platforms to our database and aggregation collection.
Conclusion
This is just a 5-minute read that simplifies how to develop a news aggregation platform.
- Collect the platforms, with RSS URLs, whose data we want to fetch
- Create an endpoint to fetch the data using an RSS parser
- Integrate the endpoint in the frontend and render all the feeds
- Showcase the UI, and create an endpoint to keep adding new platforms

I've already created this platform, but I'm not sure whether I want to open it to the public, because there is no revenue stream or business plan yet.
Once this is finalised, I'll ship the platform with cool features and a cool UI design.
Do give it a try; even if it doesn't turn into a product, it will become a good resume project to help you land a nice high-paying job.
See you in the next one
Shrey