How to Scrape LinkedIn

Crawlbase - Jun 21 - Dev Community

This blog was originally posted on the Crawlbase Blog.

LinkedIn is one of the leading platforms for finding work in the corporate world: companies use its job postings to hire, and professionals use them to advance their careers.

Scraping LinkedIn can unlock a wealth of data for businesses, researchers, and job seekers. Whether you're looking to gather information on potential job candidates, monitor company activities, or analyze industry trends, scraping LinkedIn profiles, company pages, and feeds can be incredibly valuable.

In this blog, we'll explore how to effectively use Crawlbase's Crawling API to scrape LinkedIn in Python. Crawlbase offers a robust solution for extracting data from LinkedIn, providing specific scrapers for profiles, companies, and feeds. By the end of this guide, you'll know how to set up your environment, use the Crawlbase API, and retrieve your scraped data efficiently.

Why Scrape LinkedIn?

LinkedIn is a goldmine of professional information. With over 700 million users, it offers a treasure trove of data on professionals, companies, job postings, and industry insights. Here are some compelling reasons to scrape LinkedIn:

  1. Talent Acquisition: Recruiters and HR professionals can use a LinkedIn scraper to sift through profiles and collect information on potential job candidates. This makes it easier to fill positions quickly with the right talent.
  2. Market Research: Businesses can use a LinkedIn data scraper to monitor competitors, follow the direction of the market, and review industry benchmarks. This data supports strategic planning and decision-making.
  3. Sales/Lead Generation: Sales teams can scrape LinkedIn profiles to gather leads, build contact lists for cold calling, and develop targeted outreach strategies. Reviewing profiles also helps salespeople understand the background and interests of the people they sell to.
  4. Academic Research: Scholars can use LinkedIn scraper tools to collect datasets for academic research on job trends, industry trends, business development, and how professionals network.
  5. Job Searching: Job seekers can benefit from using a LinkedIn job scraper to keep track of job postings, understand company hiring patterns, and tailor their applications based on insights gained from company profiles.

LinkedIn scraping lets you collect a large amount of data that would be very difficult to gather manually. In the following sections, we discuss what you can scrape from LinkedIn, the problems you may face, and how to use Crawlbase's Crawling API for LinkedIn scraping.

What Can We Scrape from LinkedIn?

When we talk about scraping LinkedIn, we need to understand what types of data we can scrape. With the right LinkedIn scraper, we can collect a lot of information that is useful for different purposes. Here is a summary of the data points you can scrape from LinkedIn:

Profiles:

  • Personal Information: Names, job titles, current and past positions, education, skills, endorsements, and recommendations.
  • Contact Information: Emails, phone numbers (if publicly available), and social media profiles.
  • Activity and Interests: Posts, articles, and other content shared or liked by the user.

Company Pages:

  • Company Details: Name, industry, size, location, website, and company description.
  • Job Postings: Current openings, job descriptions, requirements, and application links.
  • Employee Information: List of employees, their roles, and connections within the company.
  • Updates and News: Company posts, articles, and updates shared on their page.

Feeds:

  • Activity Feed: Latest updates, posts, and articles from users and companies you are interested in.
  • Engagement Metrics: Likes, comments, shares, and the overall engagement of posts.
  • Content Analysis: Types of content being shared, trending topics, and user engagement patterns.

By using a LinkedIn profile scraper, LinkedIn company page scraper, or a LinkedIn feeds scraper, we can scrape this information. This data may be utilized for talent acquisition, market research, lead generation, or academic research.

In the subsequent sections, we will highlight LinkedIn scraping issues, introduce Crawlbase's Crawling API, and share how you can prepare your environment and use the various LinkedIn scrapers that Crawlbase has.

Potential Challenges of Scraping LinkedIn

Scraping LinkedIn can provide valuable data, but it also comes with its challenges.

Anti-Scraping Measures:

  • IP Blocking: LinkedIn blocks an IP address when too many requests arrive from it in a short period. Avoid this by using a rotating proxy service or by adding a delay between requests.
  • CAPTCHAs: LinkedIn may show CAPTCHAs to confirm that requests come from a human. These can be handled with automatic CAPTCHA-solving services or with manual intervention.
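The request-delay idea above can be sketched as a retry loop with an exponentially growing pause. This is only an illustration, not part of Crawlbase's library: `fetch_with_backoff` and its parameters are hypothetical names, and `fetch` stands for any function that takes a URL and returns a dict with a `status_code` key.

```python
import time

def fetch_with_backoff(fetch, url, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url) until it reports status 200 or the attempts run out.

    The pause doubles after every failed attempt: base_delay, 2x, 4x, ...
    `fetch` is a hypothetical callable returning a dict with 'status_code'.
    """
    response = None
    for attempt in range(max_attempts):
        response = fetch(url)
        if response.get('status_code') == 200:
            return response
        # Wait before the next try, but not after the final attempt
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return response
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting; in normal use the default `time.sleep` applies.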

Dynamic Content:

  • Much of the content on LinkedIn pages is rendered with JavaScript, so traditional scraping methods that only fetch the raw HTML may miss it. You can use headless browsers or services such as Crawlbase Crawling API that render JavaScript to scrape dynamic content.

Legal and Ethical Considerations:

  • Terms of Service: Scraping LinkedIn may violate their terms of service. It’s crucial to understand the legal implications and ensure that your scraping activities comply with LinkedIn’s guidelines and data privacy laws.
  • User Consent: Collecting data from user profiles should be done with respect for privacy. Avoid scraping sensitive information and use the data responsibly.

Data Volume and Storage:

  • Large Data Sets: Scraping large volumes of data can be challenging in terms of processing and storage. Ensure that you have adequate infrastructure to handle and store the data you collect.
  • Data Quality: Scraped data can sometimes be incomplete or contain errors. Implement validation checks and clean the data to ensure its quality and usability.
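A validation check of the kind mentioned above might look like the following sketch. The function name `find_problems` is hypothetical, and the field names `title` and `profileUrl` are assumed from the example profile output shown later in this post; adjust them to whatever fields your own data requires.

```python
def find_problems(record, required=('title', 'profileUrl')):
    """Return a list of human-readable problems found in one scraped record."""
    problems = []
    for key in required:
        value = record.get(key)
        # Treat missing keys, non-strings, and blank strings as errors
        if not isinstance(value, str) or not value.strip():
            problems.append(f'missing or empty field: {key}')
    url = record.get('profileUrl')
    if isinstance(url, str) and url.strip() and not url.startswith('https://www.linkedin.com/'):
        problems.append('profileUrl does not point to linkedin.com')
    return problems
```

An empty list means the record passed; anything else can be logged or used to drop the record before it reaches storage.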

By being aware of these issues, you can plan your LinkedIn scraping strategy more effectively. In the next sections, we will discuss how to use Crawlbase’s Crawling API for LinkedIn scraping, including setting up your environment and using specific scrapers for profiles, company pages, and feeds.

Crawlbase Crawling API for LinkedIn Scraping

Crawlbase provides a powerful Crawling API that simplifies the process of scraping LinkedIn. Designed with developers in mind, the API can be integrated quickly into your existing systems. By using Crawlbase’s LinkedIn scrapers, you can efficiently gather data from profiles, company pages, and feeds. Here’s a brief overview of how Crawlbase’s Crawling API can help you scrape LinkedIn:

API Overview:

The Crawling API allows you to make HTTP requests to LinkedIn pages and retrieve the necessary data. It supports both GET and POST requests and handles dynamic content using headless browsers.

Anonymity:

Crawlbase uses worldwide rotating proxies with 99.9% up-time, which helps keep your scraping activity anonymous and hard to detect. This feature is crucial when dealing with platforms like LinkedIn that have strict anti-scraping measures.

Authentication:

You will need an API token to authenticate your requests. Crawlbase provides two types of tokens: one for normal requests and another for JavaScript-enabled requests.
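One way to keep the two tokens out of your source code is to read them from environment variables. The sketch below is an assumption, not a Crawlbase convention: the variable names `CRAWLBASE_TOKEN` and `CRAWLBASE_JS_TOKEN` and the helper `pick_token` are made up for illustration.

```python
import os

def pick_token(needs_javascript, env=os.environ):
    """Return the normal or the JavaScript token from the environment.

    The variable names below are hypothetical; use whatever names you prefer.
    """
    name = 'CRAWLBASE_JS_TOKEN' if needs_javascript else 'CRAWLBASE_TOKEN'
    token = env.get(name, '').strip()
    if not token:
        raise RuntimeError(f'Set the {name} environment variable first')
    return token
```

The returned value can then be passed wherever the examples in this post use 'YOUR_API_TOKEN'.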

Rate Limits and Response Times:

The API supports up to 20 requests per second per token, which allows efficient data retrieval. The average response time is between 4 and 10 seconds.
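To stay under a limit like 20 requests per second, a client can space its calls at least 1/20 of a second apart. The following is a minimal client-side sketch under that assumption; the `Throttle` class is hypothetical and is not part of the Crawlbase library.

```python
import time

class Throttle:
    """Space out calls so that at most max_per_second happen each second."""

    def __init__(self, max_per_second=20, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed, then reserve the next slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval
```

Calling `throttle.wait()` right before each API request keeps a single-threaded script within the limit; a multi-threaded client would additionally need a lock around `wait`.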

Handling Asynchronous Requests:

For LinkedIn scraping, you will often use asynchronous requests to manage large volumes of data. Crawlbase provides a unique request identifier (rid) for each asynchronous request, which you can use to retrieve the stored data later.
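Assuming the stored result may not be available the instant the rid is returned, a simple approach is to poll a few times with a fixed pause. In this sketch `wait_for_result` is a hypothetical helper, and `retrieve` stands for any function that takes a rid and returns the parsed data, or None while nothing is available yet.

```python
import time

def wait_for_result(retrieve, rid, attempts=5, delay=2.0, sleep=time.sleep):
    """Poll retrieve(rid) until it returns data or the attempts run out.

    `retrieve` is a hypothetical callable that returns None when the
    stored result cannot be fetched yet.
    """
    for attempt in range(attempts):
        data = retrieve(rid)
        if data is not None:
            return data
        # Pause between polls, but not after the last one
        if attempt < attempts - 1:
            sleep(delay)
    return None
```

The attempt count and delay are arbitrary starting values; tune them to the response times you observe.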

Next, we will guide you through setting up your environment to use Crawlbase's Crawling API and provide detailed examples for scraping LinkedIn profiles, company pages, and feeds.

Setting Up Your Environment

To scrape LinkedIn using Crawlbase’s Crawling API, you need to set up your Python environment. Here’s a step-by-step guide:

Install Python:

Download and install Python from the official website. Ensure that you add Python to your system’s PATH during installation.

Create a Virtual Environment:

Open your terminal or command prompt and navigate to your project directory. Create a virtual environment by running:

python -m venv venv

Activate the virtual environment:

  • On Windows:
  .\venv\Scripts\activate
  • On macOS/Linux:
  source venv/bin/activate

Install Crawlbase Library:

With the virtual environment activated, install the Crawlbase library using pip:

pip install crawlbase

Choose an IDE:

For writing and running your Python scripts, you can use any Integrated Development Environment (IDE) like PyCharm, VS Code, or Jupyter Notebook.

Create a Python Script:

Open your chosen IDE and create a new Python file, for example, scrape_linkedin.py. This script will contain the code to interact with Crawlbase’s API and scrape LinkedIn data.

By setting up your environment properly, you ensure a smooth workflow for scraping LinkedIn. In the next sections, we’ll dive into specific examples of using Crawlbase’s LinkedIn scrapers to extract data from profiles, company pages, and feeds.

Crawlbase LinkedIn Profiles Scraper

Using Crawlbase’s LinkedIn profile scraper, you can easily extract detailed information from LinkedIn profiles. Here’s a step-by-step guide to scraping a LinkedIn profile:

Scraping a LinkedIn Profile:

Start by importing the necessary libraries and initializing the Crawlbase API with your access token. Define the URL of the LinkedIn profile you want to scrape and set the scraping options.

from crawlbase import CrawlingAPI
import json

# Initialize Crawlbase API with your access token
crawling_api = CrawlingAPI({ 'token': 'YOUR_API_TOKEN' })

URL = 'https://www.linkedin.com/in/kaitlyn-owen'

options = {
    'scraper': 'linkedin-profile',
    'async': 'true'
}

# Function to make a request using Crawlbase API
def make_crawlbase_request(url):
    response = crawling_api.get(url, options)
    if response['status_code'] == 200:
        return json.loads(response['body'].decode('latin1'))
    else:
        print("Failed to fetch the page. Status code:", response['status_code'])
        return None

def scrape_profile(url):
    try:
        json_response = make_crawlbase_request(url)
        if json_response:
            return json_response
    except Exception as e:
        print(f"Request failed: {e}")

    return None

if __name__ == '__main__':
    scraped_data = scrape_profile(URL)
    print(json.dumps(scraped_data, indent=2))

This script initializes the Crawlbase API, defines the URL of the LinkedIn profile to scrape, and uses the linkedin-profile scraper. Because the request is asynchronous, the JSON response it prints contains only a request identifier (rid), not the profile data itself.

Example Output:

{
  "rid": "1dd4453c6f6bd93baf1d7e03"
}

Retrieving Data from Crawlbase Storage API:

When using asynchronous requests, Crawlbase saves the response and provides a request identifier (rid). You need to use this rid to retrieve the data.

from crawlbase import StorageAPI
import json

# Initialize Crawlbase Storage API with your access token
storage_api = StorageAPI({ 'token': 'YOUR_API_TOKEN' })

RID = 'your_request_identifier'

# Function to retrieve data from Crawlbase storage
def retrieve_data(rid):
    response = storage_api.get(f'https://api.crawlbase.com/storage?rid={rid}')
    if response['status_code'] == 200:
        return json.loads(response['body'].decode('latin1'))
    else:
        print("Failed to retrieve the data. Status code:", response['status_code'])
        return None

if __name__ == '__main__':
    retrieved_data = retrieve_data(RID)
    print(json.dumps(retrieved_data, indent=2))

This script retrieves the stored response using the rid and prints the JSON data.

Example Output:

{
  "title": "Kaitlyn Owen",
  "headline": "",
  "sublines": ["Miami-Fort Lauderdale Area", "5K followers", "500+ connections"],
  "location": "Miami-Fort Lauderdale Area",
  "coverImage": "https://media.licdn.com/dms/image/D4E16AQHW1GnvvOebbQ/profile-displaybackgroundimage-shrink_200_800/0/1710246724829?e=2147483647&v=beta&t=i-PEK8cxRdvov4ZERUJB6Pp9eh5jIh3LrysrpQbjgLM",
  "profileImage": "https://media.licdn.com/dms/image/C5603AQE5W6ovXILrAA/profile-displayphoto-shrink_200_200/0/1654018869301?e=2147483647&v=beta&t=WZ2BqDnTi6lOIWxNDdrnLkchmg0FparKWWU53NCaCuQ",
  "profileUrl": "https://www.linkedin.com/in/kaitlyn-owen",
  "positionInfo": {
    "company": "",
    "link": "",
    "image": null
  },
  "educationInfo": {
    "school": "",
    "link": "",
    "image": null
  },
  "websiteInfo": {
    "title": "",
    "link": ""
  },
  "summary": ["I am a self-motivated professional who is passionate about helping surgeons personally…"],
  "activities": [
    {
      "title": "With permission - 4 years after explantation of an infected aortic graft placed at another local institution. Playing golf and loving life. Best…",
      "link": "https://www.linkedin.com/posts/peter-rossi-md-facs-dfsvs-9393b934_aorta-aortaed-activity-7185799259269525504-DI5k?trk=public_profile",
      "image": "https://media.licdn.com/dms/image/D5622AQFKrMD3lTsK3w/feedshare-shrink_2048_1536/0/1713228047686?e=2147483647&v=beta&t=eZ4Blo9-IEPoDaF7TgUQbm-gFtDmRGTaW1uZOqLWEM4",
      "attributions": {
        "title": "Liked by Kaitlyn Owen",
        "link": "https://www.linkedin.com/in/kaitlyn-owen?trk=public_profile_actor-name"
      }
    },
    {
      "title": "Proud, honored and humbled immediately came to mind when I opened this award! Proud of all the hard work, honored to work for such a phenomenal…",
      "link": "https://www.linkedin.com/posts/tinaharris0214_orthopedicsurgeryteam-2023presidentsclub-activity-7189045084422631425-7ZcG?trk=public_profile",
      "image": "https://media.licdn.com/dms/image/D4D22AQGl_nS5GjrxMQ/feedshare-shrink_2048_1536/0/1714001912596?e=2147483647&v=beta&t=zLnx3M-7NVU2hbb4sdKZxdkhjMkvzCJg8smuLjtg49M",
      "attributions": {
        "title": "Liked by Kaitlyn Owen",
        "link": "https://www.linkedin.com/in/kaitlyn-owen?trk=public_profile_actor-name"
      }
    },
    {
      "title": "Great read for anyone considering locum tenens. If you are interested in learning more about how you can use locums to pay off debts, or gain…",
      "link": "https://www.linkedin.com/posts/kaitlyn-owen_the-flexibility-and-financial-freedom-of-activity-7158495374440054784-_aGb?trk=public_profile",
      "image": "https://media.licdn.com/dms/image/sync/C4D27AQFz0Posz0Y1zg/articleshare-shrink_1280_800/0/1711486435718?e=2147483647&v=beta&t=DAqF2nK5hI9RV0D7EhVLX35ZLiAUMUA-Tuosq7WtCQ4",
      "attributions": {
        "title": "Shared by Kaitlyn Owen",
        "link": "https://www.linkedin.com/in/kaitlyn-owen?trk=public_profile_actor-name"
      }
    },
    {
      "title": "Reflecting on another amazing year! In 2023, I had the opportunity to work with so many incredible surgeons and hospitals, to help get healthcare to…",
      "link": "https://www.linkedin.com/posts/kaitlyn-owen_weatherbyhealthcare-chghealthcare-locum-activity-7146531265897234432-EzBj?trk=public_profile",
      "image": "https://media.licdn.com/dms/image/D4E22AQHtegaMfmHSfw/feedshare-shrink_2048_1536/0/1703865828105?e=2147483647&v=beta&t=6aTOKbxcyH4hgJswNj_WOvE9AxeUnASsnb6Kxv0ChPU",
      "attributions": {
        "title": "Shared by Kaitlyn Owen",
        "link": "https://www.linkedin.com/in/kaitlyn-owen?trk=public_profile_actor-name"
      }
    },
    {
      "title": "A great read for anyone who is currently doing, or has thought about doing, locum tenens work! Learn the ins and outs of finances while working…",
      "link": "https://www.linkedin.com/posts/kaitlyn-owen_what-to-know-about-locum-tenens-finances-activity-7140344345198501889-4uC4?trk=public_profile",
      "image": "https://media.licdn.com/dms/image/sync/D5627AQELglatDP2mXw/articleshare-shrink_1280_800/0/1711744807281?e=2147483647&v=beta&t=IMAyTPy3fSuf36q9PEvlBc31xbCrayyaAVeNa_Zs45g",
      "attributions": {
        "title": "Shared by Kaitlyn Owen",
        "link": "https://www.linkedin.com/in/kaitlyn-owen?trk=public_profile_actor-name"
      }
    }
  ],
  "experience": {
    "experienceTotal": 0,
    "experienceGroup": [],
    "experienceList": []
  },
  "education": [
    {
      "school": "",
      "link": "",
      "image": null,
      "degreeInfo": [],
      "startDate": "2014",
      "endDate": "2018"
    },
    {
      "school": "",
      "link": "",
      "image": null,
      "degreeInfo": [],
      "startDate": "2014",
      "endDate": "2015"
    }
  ],
  "publications": [],
  "patents": [],
  "volunteering": [],
  "certifications": [],
  "courses": [],
  "projects": [],
  "languages": [],
  "organizations": [],
  "groups": [],
  "recommendations": [
    {
      "text": "“I can highly recommend Kaitlyn from personal experience. She reached out to me as I was transitioning out of a long term surgical Practice. She was eager, vivacious , and persistent- all qualities which she continues to use to get me Locums jobs. She has been like my own personal concierge service at WEATHERBY. Her communication skills are fantastic- always reaching out, making sure things are in order, before, during And after an assignment. I have truly enjoyed working with her and look forward to an ongoing relationship .”"
    },
    {
      "text": "“It is with pleasure that I write this letter of recommendation for Kaitlyn Owen. First a little background is in order. I \"met\" Kaitlyn after she called me as Weatherby's representative and asked to connect me with hospitals in need of Locums coverage. We have never met in person but her personality, persistence, and her ability to \"connect\" on a meaningful level comes through wether it is on the phone, in text or e-mail. It is clear that she is organized and can coordinate multiple physicians and opportunities all at the same time. I have not encountered a problem that Kaitlyn was unable to address and solve. Without meeting her in person, she appears to be a genuine, warm, charming individual. I can recommend Kaitlyn without any reservation. I could go on but I believe that brevity keeps the message \"pure.\" In short I am very fortunate to have her as my Weatherby representative.”"
    }
  ],
  "awards": [],
  "peopleAlsoViewed": [
    {
      "title": "Michelle Bowdich",
      "position": "",
      "link": "https://www.linkedin.com/in/michellebowdich?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Christy K",
      "position": "",
      "link": "https://www.linkedin.com/in/christy-k-10826233?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Morgan McEldowney",
      "position": "",
      "link": "https://www.linkedin.com/in/morgan-mceldowney?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Lily Kholina",
      "position": "",
      "link": "https://www.linkedin.com/in/lily-kholina-81006b64?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Ainsley Rodriguez",
      "position": "",
      "link": "https://www.linkedin.com/in/ainsley-rodriguez-a50b3a145?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Brooke Gibson",
      "position": "",
      "link": "https://www.linkedin.com/in/brooke-gibson-348bb2140?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Brandi Talton",
      "position": "",
      "link": "https://www.linkedin.com/in/brandi-talton-653b46121?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Chelsea Donaldson",
      "position": "",
      "link": "https://www.linkedin.com/in/chelsea-donaldson-a343838a?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Constance Bailes",
      "position": "Public Relations and Marketing",
      "link": "https://www.linkedin.com/in/constance-bailes-6710a384?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Melissa Katcher",
      "position": "",
      "link": "https://www.linkedin.com/in/melissa-katcher-23700a88?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Megan Racer",
      "position": "",
      "link": "https://www.linkedin.com/in/megan-racer-a82720224?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Erika Glam",
      "position": "",
      "link": "https://www.linkedin.com/in/erika-glam-25060242?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Tashina Rickerson",
      "position": "",
      "link": "https://www.linkedin.com/in/tashina-rickerson-45304991?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Lauren LaDell",
      "position": "",
      "link": "https://www.linkedin.com/in/lauren-ladell?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Tara Teeter",
      "position": "Life Insurance Agent, Marketing Specialist, Event Planner",
      "link": "https://www.linkedin.com/in/tara-teeter-784b7926?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Casie Greene",
      "position": "",
      "link": "https://www.linkedin.com/in/casie-greene-7693bba0?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Kristin Kubrick",
      "position": "Sales Manager II at Weatherby Healthcare - Surgery Division",
      "link": "https://www.linkedin.com/in/kristin-kubrick-7bba03134?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Jillian Davis",
      "position": "",
      "link": "https://www.linkedin.com/in/aboutfacemodels?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Savannah Stow",
      "position": "Marketing Manager at Self Employed",
      "link": "https://www.linkedin.com/in/savannahreel?trk=public_profile_browsemap-profile",
      "image": null
    },
    {
      "title": "Wesley McQuaid",
      "position": "",
      "link": "https://www.linkedin.com/in/wesley-mcquaid?trk=public_profile_browsemap-profile",
      "image": null
    }
  ],
  "sameNamed": [
    {
      "title": "Kaitlyn Owen",
      "position": "Project Manager at IU Health Physicians",
      "link": "https://www.linkedin.com/in/kaitlynkolzow?trk=public_profile_samename-profile",
      "image": null,
      "location": "Indianapolis, IN"
    },
    {
      "title": "Kaitlyn Owen",
      "position": "Administrative Professional",
      "link": "https://www.linkedin.com/in/kaitlyn-owen-704b8b91?trk=public_profile_samename-profile",
      "image": null,
      "location": "Winston-Salem, NC"
    },
    {
      "title": "Kaitlyn Owen",
      "position": "Student at University of Illinois Urbana-Champaign",
      "link": "https://www.linkedin.com/in/kaitlyn-owen-bb9a46267?trk=public_profile_samename-profile",
      "image": null,
      "location": "McHenry, IL"
    },
    {
      "title": "Kaitlyn Owen",
      "position": "",
      "link": "https://www.linkedin.com/in/kaitlyn-owen-1a469575?trk=public_profile_samename-profile",
      "image": null,
      "location": "Redmond, WA"
    }
  ],
  "similarProfiles": []
}
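Once the profile JSON has been retrieved, a few lines of Python are enough to pull out the parts you care about. The sketch below relies only on field names visible in the example output above (`title`, `location`, `activities`, `attributions`, `recommendations`); the function name `summarize_profile` is hypothetical, and other profiles may be missing some of these fields.

```python
def summarize_profile(profile):
    """Condense a scraped profile dict into a few headline numbers."""
    activities = profile.get('activities') or []
    # Activities are labelled "Shared by ..." or "Liked by ..." in attributions
    shared = [
        a for a in activities
        if ((a.get('attributions') or {}).get('title') or '').startswith('Shared by')
    ]
    return {
        'name': profile.get('title', ''),
        'location': profile.get('location', ''),
        'activity_count': len(activities),
        'shared_count': len(shared),
        'recommendation_count': len(profile.get('recommendations') or []),
    }
```

For example, `summarize_profile(retrieved_data)` could be called on the dictionary returned by `retrieve_data` in the script above.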

By following these steps, you can effectively scrape LinkedIn profiles using Crawlbase's API. Next, we’ll explore how to scrape LinkedIn company pages and feeds.

Crawlbase LinkedIn Company Pages Scraper

Next, let’s explore how to use Crawlbase's LinkedIn company pages scraper. This tool allows you to extract detailed information about companies listed on LinkedIn.

Scraping a LinkedIn Company Page:

To scrape a LinkedIn company page, you'll need to set up a script similar to the one used for scraping profiles. Here’s how you can do it:

from crawlbase import CrawlingAPI
import json

# Initialize Crawlbase API with your access token
crawling_api = CrawlingAPI({ 'token': 'YOUR_API_TOKEN' })

URL = 'https://www.linkedin.com/company/amazon'

options = {
    'scraper': 'linkedin-company',
    'async': 'true'
}

# Function to make a request using Crawlbase API
def make_crawlbase_request(url):
    response = crawling_api.get(url, options)
    if response['status_code'] == 200:
        return json.loads(response['body'].decode('latin1'))
    else:
        print("Failed to fetch the page. Status code:", response['status_code'])
        return None

def scrape_company(url):
    try:
        json_response = make_crawlbase_request(url)
        if json_response:
            return json_response
    except Exception as e:
        print(f"Request failed: {e}")

    return None

if __name__ == '__main__':
    scraped_data = scrape_company(URL)
    print(json.dumps(scraped_data, indent=2))

This script initializes the Crawlbase API, sets the URL of the LinkedIn company page you want to scrape, and specifies the linkedin-company scraper. Because the request is asynchronous, the JSON response it prints contains only a request identifier (rid), not the company data itself.

Example Output:

{
  "rid": "f270321bbebe203b43cebedd"
}

Retrieving Data from Crawlbase Storage API:

As with profile scraping, asynchronous requests will return a rid. You can use this rid to retrieve the stored data.

from crawlbase import StorageAPI
import json

# Initialize Crawlbase Storage API with your access token
storage_api = StorageAPI({ 'token': 'YOUR_API_TOKEN' })

RID = 'your_request_identifier'

# Function to retrieve data from Crawlbase storage
def retrieve_data(rid):
    response = storage_api.get(f'https://api.crawlbase.com/storage?rid={rid}')
    if response['status_code'] == 200:
        return json.loads(response['body'].decode('latin1'))
    else:
        print("Failed to retrieve the data. Status code:", response['status_code'])
        return None

if __name__ == '__main__':
    retrieved_data = retrieve_data(RID)
    print(json.dumps(retrieved_data, indent=2))

This script retrieves and prints the stored company data using the rid.

Example Output:

{
  "title": "Amazon",
  "headline": "Software Development",
  "cover_image": "https://media.licdn.com/dms/image/D4D3DAQGri_YWxYb-GQ/image-scale_191_1128/0/1681945878609/amazon_cover?e=2147483647&v=beta&t=DEHImsFhQdlARMSTcY2AmdImxdLxIyvDncPmPQEpebY",
  "company_image": "https://media.licdn.com/dms/image/C560BAQHTvZwCx4p2Qg/company-logo_200_200/0/1630640869849/amazon_logo?e=2147483647&v=beta&t=2vRB20XZOYNtXSr5GHAUUQXXII4lvgcotA2QTMcRHOI",
  "url": "https://www.linkedin.com/company/amazon",
  "employees": {
    "numberOfEmployees": 737833,
    "link": "https://www.linkedin.com/search/results/people/?facetCurrentCompany=%5B15218805%2C+2649984%2C+17411%2C+78392228%2C+208137%2C+61712%2C+2382910%2C+49318%2C+16551%2C+80073065%2C+47157%2C+21433%2C+71099%2C+860467%2C+12227%2C+167364%2C+4787585%2C+11091426%2C+451028%2C+111446%2C+14951%2C+46825%2C+2320329%2C+34924%2C+1586%5D"
  },
  "followersCount": 31243559,
  "tagging": "",
  "description": "Amazon is guided by four principles: customer obsession rather than competitor focus, passion for invention, commitment to operational excellence, and long-term thinking. We are driven by the excitement of building technologies, inventing products, and providing services that change lives. We embrace new ways of doing things, make decisions quickly, and are not afraid to fail. We have the scope and capabilities of a large company, and the spirit and heart of a small one. Together, Amazonians research and develop new technologies from Amazon Web Services to Alexa on behalf of our customers: shoppers, sellers, content creators, and developers around the world. Our mission is to be Earth's most customer-centric company. Our actions, goals, projects, programs, and inventions begin and end with the customer top of mind. You'll also hear us say that at Amazon, it's always \"Day 1.\" What do we mean? That our approach remains the same as it was on Amazon's very first day - to make smart, fast decisions, stay nimble, invent, and focus on delighting our customers.",
  "basicInfo": [
    {
      "name": "Website",
      "value": "https://www.aboutamazon.com/ External link for Amazon"
    },
    {
      "name": "Industry",
      "value": "Software Development"
    },
    {
      "name": "Company size",
      "value": "10,001+ employees"
    },
    {
      "name": "Headquarters",
      "value": "Seattle, WA"
    },
    {
      "name": "Type",
      "value": "Public Company"
    },
    {
      "name": "Specialties",
      "value": "e-Commerce, Retail, Operations, and Internet"
    }
  ],
  "locations": {
    "primary": {
      "address": "2127 7th Ave.Seattle, WA 98109, US",
      "link": "https://www.bing.com/maps?where=2127+7th+Ave.+Seattle+98109+WA+US&trk=org-locations_url"
    },
    "other": [
      {
        "address": "12900 Worldgate DrHerndon, VA 20170, US",
        "link": "https://www.bing.com/maps?where=12900+Worldgate+Dr+Herndon+20170+VA+US&trk=org-locations_url"
      },
      {
        "address": "7200 Discovery DrChattanooga, TN 37416, US",
        "link": "https://www.bing.com/maps?where=7200+Discovery+Dr+Chattanooga+37416+TN+US&trk=org-locations_url"
      },
      {
        "address": "1100 Enterprise WaySunnyvale, CA 94089, US",
        "link": "https://www.bing.com/maps?where=1100+Enterprise+Way+Sunnyvale+94089+CA+US&trk=org-locations_url"
      },
      {
        "address": "2010 Broening HwyBaltimore, MD 21224, US",
        "link": "https://www.bing.com/maps?where=2010+Broening+Hwy+Baltimore+21224+MD+US&trk=org-locations_url"
      },
      {
        "address": "Buyukdere Caddesi 185Istanbul, Istanbul 34394, TR",
        "link": "https://www.bing.com/maps?where=Buyukdere+Caddesi+185+Istanbul+34394+Istanbul+TR&trk=org-locations_url"
      },
      {
        "address": "Via de las Dos Castillas, 33Pozuelo de Alarcon, Community of Madrid 28224, ES",
        "link": "https://www.bing.com/maps?where=Via+de+las+Dos+Castillas,+33+Pozuelo+de+Alarcon+28224+Community+of+Madrid+ES&trk=org-locations_url&"
      },
      {
        "address": "Im GewerbeparkRegensburg, Bavaria 93059, DE",
        "link": "https://www.bing.com/maps?where=Im+Gewerbepark+Regensburg+93059+Bavaria+DE&trk=org-locations_url"
      },
      {
        "address": "8 Exhibition StMelbourne, VIC 3000, AU",
        "link": "https://www.bing.com/maps?where=8+Exhibition+St+Melbourne+3000+VIC+AU&trk=org-locations_url"
      },
      {
        "address": "705 Boulder DrBreinigsville, PA 18031, US",
        "link": "https://www.bing.com/maps?where=705+Boulder+Dr+Breinigsville+18031+PA+US&trk=org-locations_url"
      },
      {
        "address": "2700 Regent BlvdIrving, TX 75063, US",
        "link": "https://www.bing.com/maps?where=2700+Regent+Blvd+Irving+75063+TX+US&trk=org-locations_url"
      },
      {
        "address": "500 Kinetic DrHuntington, WV 25701, US",
        "link": "https://www.bing.com/maps?where=500+Kinetic+Dr+Huntington+25701+WV+US&trk=org-locations_url"
      },
      {
        "address": "1125 Remington BlvdRomeoville, IL 60446, US",
        "link": "https://www.bing.com/maps?where=1125+Remington+Blvd+Romeoville+60446+IL+US&trk=org-locations_url"
      },
      {
        "address": "Burlington RoadDublin, County Dublin, IE",
        "link": "https://www.bing.com/maps?where=Burlington+Road+Dublin+County+Dublin+IE&trk=org-locations_url"
      },
      {
        "address": "109 Braid StNew Westminster, BC V3L 5H4, CA",
        "link": "https://www.bing.com/maps?where=109+Braid+St+New+Westminster+V3L+5H4+BC+CA&trk=org-locations_url"
      },
      {
        "address": "Solan RdCape Town, Western Cape 8001, ZA",
        "link": "https://www.bing.com/maps?where=Solan+Rd+Cape+Town+8001+Western+Cape+ZA&trk=org-locations_url"
      },
      {
        "address": "2700 Center DrDupont, WA 98327, US",
        "link": "https://www.bing.com/maps?where=2700+Center+Dr+Dupont+98327+WA+US&trk=org-locations_url"
      },
      {
        "address": "8000 N Virginia StReno, NV 89506, US",
        "link": "https://www.bing.com/maps?where=8000+N+Virginia+St+Reno+89506+NV+US&trk=org-locations_url"
      },
      {
        "address": "4848 Perrin CreekSan Antonio, TX 78217, US",
        "link": "https://www.bing.com/maps?where=4848+Perrin+Creek+San+Antonio+78217+TX+US&trk=org-locations_url"
      },
      {
        "address": "1555 N Chrisman RdTracy, CA 95304, US",
        "link": "https://www.bing.com/maps?where=1555+N+Chrisman+Rd+Tracy+95304+CA+US&trk=org-locations_url"
      },
      {
        "address": "60 Holborn ViaductLondon, England EC1A 2FD, GB",
        "link": "https://www.bing.com/maps?where=60+Holborn+Viaduct+London+EC1A+2FD+England+GB&trk=org-locations_url"
      },
      {
        "address": "120 Bremner BlvdToronto, ON M5J 0A8, CA",
        "link": "https://www.bing.com/maps?where=120+Bremner+Blvd+Toronto+M5J+0A8+ON+CA&trk=org-locations_url"
      },
      {
        "address": "31 Rives de ClausenLuxembourg, Luxembourg 2165, LU",
        "link": "https://www.bing.com/maps?where=31+Rives+de+Clausen+Luxembourg+2165+Luxembourg+LU&trk=org-locations_url"
      },
      {
        "address": "Sunbank LaneAltrincham, England WA15 0, GB",
        "link": "https://www.bing.com/maps?where=Sunbank+Lane+Altrincham+WA15+0+England+GB&trk=org-locations_url"
      },
      {
        "address": "86 5th St NWAtlanta, GA 30308, US",
        "link": "https://www.bing.com/maps?where=86+5th+St+NW+Atlanta+30308+GA+US&trk=org-locations_url"
      },
      {
        "address": "402 John Dodd RdSpartanburg, SC 29303, US",
        "link": "https://www.bing.com/maps?where=402+John+Dodd+Rd+Spartanburg+29303+SC+US&trk=org-locations_url"
      },
      {
        "address": "Waterloo PlaceEdinburgh, Scotland EH1 3EG, GB",
        "link": "https://www.bing.com/maps?where=Waterloo+Place+Edinburgh+EH1+3EG+Scotland+GB&trk=org-locations_url"
      },
      {
        "address": "Am Brauhaus 12Dresden, SN 01099, DE",
        "link": "https://www.bing.com/maps?where=Am+Brauhaus+12+Dresden+01099+SN+DE&trk=org-locations_url"
      },
      {
        "address": "3501 120th AveKenosha, WI 53144, US",
        "link": "https://www.bing.com/maps?where=3501+120th+Ave+Kenosha+53144+WI+US&trk=org-locations_url"
      },
      {
        "address": "24208 San Michele RdMoreno Valley, CA 92551, US",
        "link": "https://www.bing.com/maps?where=24208+San+Michele+Rd+Moreno+Valley+92551+CA+US&trk=org-locations_url"
      },
      {
        "address": "Calle del Hierro, 21Madrid, Community of Madrid 28045, ES",
        "link": "https://www.bing.com/maps?where=Calle+del+Hierro,+21+Madrid+28045+Community+of+Madrid+ES&trk=org-locations_url&"
      },
      {
        "address": "50 Airways BlvdNashville, TN 37217, US",
        "link": "https://www.bing.com/maps?where=50+Airways+Blvd+Nashville+37217+TN+US&trk=org-locations_url"
      },
      {
        "address": "3350 Laurel Ridge AveRuskin, FL 33570, US",
        "link": "https://www.bing.com/maps?where=3350+Laurel+Ridge+Ave+Ruskin+33570+FL+US&trk=org-locations_url"
      },
      {
        "address": "4255 Anson BlvdWhitestown, IN 46075, US",
        "link": "https://www.bing.com/maps?where=4255+Anson+Blvd+Whitestown+46075+IN+US&trk=org-locations_url"
      },
      {
        "address": "2170 RT-27Edison, NJ 08817, US",
        "link": "https://www.bing.com/maps?where=2170+RT-27+Edison+08817+NJ+US&trk=org-locations_url"
      },
      {
        "address": "560 Merrimac AveMiddletown, DE 19709, US",
        "link": "https://www.bing.com/maps?where=560+Merrimac+Ave+Middletown+19709+DE+US&trk=org-locations_url"
      },
      {
        "address": "150 W Jefferson AveDetroit, MI 48226, US",
        "link": "https://www.bing.com/maps?where=150+W+Jefferson+Ave+Detroit+48226+MI+US&trk=org-locations_url"
      },
      {
        "address": "101 Main StCambridge, MA 02142, US",
        "link": "https://www.bing.com/maps?where=101+Main+St+Cambridge+02142+MA+US&trk=org-locations_url"
      },
      {
        "address": "1800 140th Ave ESumner, WA 98390, US",
        "link": "https://www.bing.com/maps?where=1800+140th+Ave+E+Sumner+98390+WA+US&trk=org-locations_url"
      },
      {
        "address": "5000 Commerce WayPetersburg, VA 23803, US",
        "link": "https://www.bing.com/maps?where=5000+Commerce+Way+Petersburg+23803+VA+US&trk=org-locations_url"
      },
      {
        "address": "50 New Canton WayRobbinsville Township, NJ 08691, US",
        "link": "https://www.bing.com/maps?where=50+New+Canton+Way+Robbinsville+Township+08691+NJ+US&trk=org-locations_url"
      },
      {
        "address": "12900 Pecan Park RdJacksonville, FL 32218, US",
        "link": "https://www.bing.com/maps?where=12900+Pecan+Park+Rd+Jacksonville+32218+FL+US&trk=org-locations_url"
      },
      {
        "address": "4400 12th Street ExtWest Columbia, SC 29172, US",
        "link": "https://www.bing.com/maps?where=4400+12th+Street+Ext+West+Columbia+29172+SC+US&trk=org-locations_url"
      },
      {
        "address": "2 Park StSydney, NSW 2000, AU",
        "link": "https://www.bing.com/maps?where=2+Park+St+Sydney+2000+NSW+AU&trk=org-locations_url"
      },
      {
        "address": "510 W Georgia StVancouver, BC V6B 0M3, CA",
        "link": "https://www.bing.com/maps?where=510+W+Georgia+St+Vancouver+V6B+0M3+BC+CA&trk=org-locations_url"
      },
      {
        "address": "7290 Investment DrNorth Charleston, SC 29418, US",
        "link": "https://www.bing.com/maps?where=7290+Investment+Dr+North+Charleston+29418+SC+US&trk=org-locations_url"
      },
      {
        "address": "11999 National Rd SWPataskala, OH 43062, US",
        "link": "https://www.bing.com/maps?where=11999+National+Rd+SW+Pataskala+43062+OH+US&trk=org-locations_url"
      },
      {
        "address": "6400 Avenue 6000Cork, County Cork T12 D292, IE",
        "link": "https://www.bing.com/maps?where=6400+Avenue+6000+Cork+T12+D292+County+Cork+IE&trk=org-locations_url"
      },
      {
        "address": "96 E San Fernando StSan Jose, CA 95113, US",
        "link": "https://www.bing.com/maps?where=96+E+San+Fernando+St+San+Jose+95113+CA+US&trk=org-locations_url"
      },
      {
        "address": "Namestie 1. maja 7286/18Bratislava, Bratislava 811 06, SK",
        "link": "https://www.bing.com/maps?where=Namestie+1.+maja+7286/18+Bratislava+811+06+Bratislava+SK&trk=org-locations_url"
      },
      {
        "address": "Rue de PlanqueLauwin-Planque, Hauts-de-France 59553, FR",
        "link": "https://www.bing.com/maps?where=Rue+de+Planque+Lauwin-Planque+59553+Hauts-de-France+FR&trk=org-locations_url"
      },
      {
        "address": "23 Church StSingapore, Singapore 049481, SG",
        "link": "https://www.bing.com/maps?where=23+Church+St+Singapore+049481+Singapore+SG&trk=org-locations_url"
      },
      {
        "address": "8120 Humble Westfield RdHumble, TX 77338, US",
        "link": "https://www.bing.com/maps?where=8120+Humble+Westfield+Rd+Humble+77338+TX+US&trk=org-locations_url"
      },
      {
        "address": "2996 Ramona AveSacramento, CA 95826, US",
        "link": "https://www.bing.com/maps?where=2996+Ramona+Ave+Sacramento+95826+CA+US&trk=org-locations_url"
      },
      {
        "address": "801 30 St NECalgary, AB T2A 5L7, CA",
        "link": "https://www.bing.com/maps?where=801+30+St+NE+Calgary+T2A+5L7+AB+CA&trk=org-locations_url"
      },
      {
        "address": "3610 NW Saint Helens RdPortland, OR 97210, US",
        "link": "https://www.bing.com/maps?where=3610+NW+Saint+Helens+Rd+Portland+97210+OR+US&trk=org-locations_url"
      },
      {
        "address": "Avenida Juan Salvador Agraz 73Cuajimalpa de Morelos, CDMX 05348, MX",
        "link": "https://www.bing.com/maps?where=Avenida+Juan+Salvador+Agraz+73+Cuajimalpa+de+Morelos+05348+CDMX+MX&trk=org-locations_url"
      },
      {
        "address": "8050 Heritage RdBrampton, ON L6Y 0C9, CA",
        "link": "https://www.bing.com/maps?where=8050+Heritage+Rd+Brampton+L6Y+0C9+ON+CA&trk=org-locations_url"
      },
      {
        "address": "Evropska 2758/11Prague, Prague 160 00, CZ",
        "link": "https://www.bing.com/maps?where=Evropska+2758/11+Prague+160+00+Prague+CZ&trk=org-locations_url"
      },
      {
        "address": "1910 E Central AveSan Bernardino, CA 92408, US",
        "link": "https://www.bing.com/maps?where=1910+E+Central+Ave+San+Bernardino+92408+CA+US&trk=org-locations_url"
      },
      {
        "address": "1414 S Council RdOklahoma City, OK 73128, US",
        "link": "https://www.bing.com/maps?where=1414+S+Council+Rd+Oklahoma+City+73128+OK+US&trk=org-locations_url"
      },
      {
        "address": "1401 E McCarty LnSan Marcos, TX 78666, US",
        "link": "https://www.bing.com/maps?where=1401+E+McCarty+Ln+San+Marcos+78666+TX+US&trk=org-locations_url"
      },
      {
        "address": "Habibullah RoadChennai, Tamil Nadu 600017, IN",
        "link": "https://www.bing.com/maps?where=Habibullah+Road+Chennai+600017+Tamil+Nadu+IN&trk=org-locations_url"
      },
      {
        "address": "188 Spear StSan Francisco, CA 94105, US",
        "link": "https://www.bing.com/maps?where=188+Spear+St+San+Francisco+94105+CA+US&trk=org-locations_url"
      },
      {
        "address": "Via delle MechanicaFara in Sabina, Laz. 02032, IT",
        "link": "https://www.bing.com/maps?where=Via+delle+Mechanica+Fara+in+Sabina+02032+Laz.+IT&trk=org-locations_url"
      },
      {
        "address": "2302 Marietta Blvd NWAtlanta, GA 30318, US",
        "link": "https://www.bing.com/maps?where=2302+Marietta+Blvd+NW+Atlanta+30318+GA+US&trk=org-locations_url"
      },
      {
        "address": "Lane CtSterling, VA 20166, US",
        "link": "https://www.bing.com/maps?where=Lane+Ct+Sterling+20166+VA+US&trk=org-locations_url"
      },
      {
        "address": "SapirHerzliya, Tel Aviv 46000, IL",
        "link": "https://www.bing.com/maps?where=Sapir+Herzliya+46000+Tel+Aviv+IL&trk=org-locations_url"
      },
      {
        "address": "462 Hazelwood Logistics Center DrHazelwood, MO 63042, US",
        "link": "https://www.bing.com/maps?where=462+Hazelwood+Logistics+Center+Dr+Hazelwood+63042+MO+US&trk=org-locations_url"
      },
      {
        "address": "390 Interlocken CrescentBroomfield, CO 80021, US",
        "link": "https://www.bing.com/maps?where=390+Interlocken+Crescent+Broomfield+80021+CO+US&trk=org-locations_url"
      },
      {
        "address": "10201 Torre AveCupertino, CA 95014, US",
        "link": "https://www.bing.com/maps?where=10201+Torre+Ave+Cupertino+95014+CA+US&trk=org-locations_url"
      },
      {
        "address": "700 Westport PkwyFort Worth, TX 76177, US",
        "link": "https://www.bing.com/maps?where=700+Westport+Pkwy+Fort+Worth+76177+TX+US&trk=org-locations_url"
      },
      {
        "address": "763 SE Kasota AveMinneapolis, MN 55414, US",
        "link": "https://www.bing.com/maps?where=763+SE+Kasota+Ave+Minneapolis+55414+MN+US&trk=org-locations_url"
      },
      {
        "address": "1850 Mercer RdLexington, KY 40511, US",
        "link": "https://www.bing.com/maps?where=1850+Mercer+Rd+Lexington+40511+KY+US&trk=org-locations_url"
      },
      {
        "address": "4411 W 2100 SWest Valley City, UT 84120, US",
        "link": "https://www.bing.com/maps?where=4411+W+2100+S+West+Valley+City+84120+UT+US&trk=org-locations_url"
      },
      {
        "address": "Carrer de l'Alta RibagorcaEl Prat de Llobregat, Catalonia 08820, ES",
        "link": "https://www.bing.com/maps?where=Carrer+de+l'Alta+Ribagorca+El+Prat+de+Llobregat+08820+Catalonia+ES&trk=org-locations_url&"
      },
      {
        "address": "11501 Alterra PkwyAustin, TX 78758, US",
        "link": "https://www.bing.com/maps?where=11501+Alterra+Pkwy+Austin+78758+TX+US&trk=org-locations_url"
      },
      {
        "address": "Sikanderpur FlyoverGurugram, HR 122008, IN",
        "link": "https://www.bing.com/maps?where=Sikanderpur+Flyover+Gurugram+122008+HR+IN&trk=org-locations_url"
      },
      {
        "address": "2277 Center Square RdLogan Township, NJ 08085, US",
        "link": "https://www.bing.com/maps?where=2277+Center+Square+Rd+Logan+Township+08085+NJ+US&trk=org-locations_url"
      },
      {
        "address": "Marcel-Breuer-Straße 12Munich, Bavaria 80807, DE",
        "link": "https://www.bing.com/maps?where=Marcel-Breuer-Stra%C3%9Fe+12+Munich+80807+Bavaria+DE&trk=org-locations_url"
      }
    ]
  },
  "employeesAtCompany": [
    {
      "title": "Steven Hatch",
      "position": "Experienced Amazon Engineering Leader | Generative AI at Amazon",
      "link": "https://www.linkedin.com/in/hatch?trk=org-employees",
      "image": "https://media.licdn.com/dms/image/D4E03AQG823Q38d3Igg/profile-displayphoto-shrink_100_100/0/1673281011530?e=2147483647&v=beta&t=sK2PKC8tMDWU5koa0DpKxZzhQ1Zofs1shi941xNscrQ",
      "location": ""
    },
    {
      "title": "Brendon Wilson",
      "position": "Product Management Leader | Voice | Cloud | AI",
      "link": "https://www.linkedin.com/in/brendonwilson?trk=org-employees",
      "image": "https://media.licdn.com/dms/image/C5603AQGpn-EXgHDXiQ/profile-displayphoto-shrink_100_100/0/1526444059773?e=2147483647&v=beta&t=hfK-dOJtTnoAHYmsP53HQl7n9rewgM8_EpzZYwW93cs",
      "location": ""
    },
    {
      "title": "Kara H. Hurst",
      "position": "Chief Sustainability Officer, Amazon",
      "link": "https://www.linkedin.com/in/karahhurst?trk=org-employees",
      "image": "https://media.licdn.com/dms/image/D5603AQFpYGVopejk6g/profile-displayphoto-shrink_100_100/0/1700153802278?e=2147483647&v=beta&t=exoaVmbqrMPy9xjau_dj9x4xgRNhFVoZfDc_WFbi2j8",
      "location": ""
    },
    {
      "title": "John Combs",
      "position": "Business & Corporate Development at Amazon",
      "link": "https://www.linkedin.com/in/johnmcombs?trk=org-employees",
      "image": "https://media.licdn.com/dms/image/C4E03AQEMAiAH3Qu03Q/profile-displayphoto-shrink_100_100/0/1516155765577?e=2147483647&v=beta&t=FhQvl_SXSxTTO6ZQt-Hb-BXzqOAYJpqdnZ3tcPkaI_w",
      "location": ""
    }
  ],
  "updates": [
    {
      "actor": "Amazon",
      "actorLink": "https://www.linkedin.com/company/amazon?trk=organization_guest_main-feed-card_feed-actor-name",
      "postDate": "7h",
      "text": "Looking to boost your AI skills? 💥 Research shows that professionals with strong AI skills can earn higher salaries – up to 47% higher in IT, 43% higher in sales and marketing, and 42% higher in finance. Amazon Web Services (AWS) has you covered through two new AWS Certifications – one on AI foundations, and one for machine learning. Here’s the breakdown.⬇️ 1️⃣ AWS Certified AI Practitioner: This one's not just for techies. If you work in a field like marketing, sales, finance, or HR, you can increase your knowledge about AI and Gen AI concepts while learning how to sniff out opportunities to use AI tools in the workplace. 2️⃣ AWS Certified Machine Learning Engineer – Associate: This one's designed for people with slightly more ML experience. This certification is for you if you want to validate that you can build, deploy, and maintain AI models for real-time use. Whether you’re a student beginning to explore a career in AI, or a professional looking to get ahead, these new certifications can help you stay on the cutting edge. We're curious to know: are you interested in boosting your AI skills? 📕 💡 Learn more: https://amzn.to/3RnMxCw",
      "media": [],
      "reactionsCount": 291,
      "commentsCount": 39,
      "textLinks": [
        "https://www.linkedin.com/company/amazon-web-services?trk=organization_guest_main-feed-card-text",
        "https://amzn.to/3RnMxCw?trk=organization_guest_main-feed-card-text"
      ],
      "textTags": []
    },
    {
      "actor": "Amazon",
      "actorLink": "https://www.linkedin.com/company/amazon?trk=organization_guest_main-feed-card_feed-actor-name",
      "postDate": "1d",
      "text": "Watch below as Amazon leaders share their best pieces of career advice. In this compilation from our Meet the Leader series, they answered some hard-hitting questions – including Star Wars vs. Star Trek. What are your best leadership tips? ⭐ Drop them in the comments below. ⬇️ Learn more here: https://amzn.to/3xfURO0",
      "media": [],
      "reactionsCount": 747,
      "commentsCount": 51,
      "textLinks": ["https://amzn.to/3xfURO0?trk=organization_guest_main-feed-card-text"],
      "textTags": []
    },
    {
      "actor": "Amazon",
      "actorLink": "https://www.linkedin.com/company/amazon?trk=organization_guest_main-feed-card_feed-actor-name",
      "postDate": "3d",
      "text": "🗽 Step into history at our newest New York office. Originally one of the first department stores in the U.S., we restored this iconic Lord & Taylor NYC landmark to its roots with a modern twist.",
      "media": [],
      "reactionsCount": 2870,
      "commentsCount": 130,
      "textLinks": [],
      "textTags": []
    },
    {
      "actor": "Amazon",
      "actorLink": "https://www.linkedin.com/company/amazon?trk=organization_guest_main-feed-card_feed-actor-name",
      "postDate": "1w",
      "text": "🏋️♂️ Meet Amazon's very own strongman! 🏋️♂️ This weekend, 27-year-old Luke Sperduti from Bristol, England will be competing for the title of UK’s Strongest Man. 💪 Since joining Amazon in 2020, Luke has risen to the role of Operations Supervisor and will soon take on a new challenge. His journey to strength started in the Corps of Royal Engineers, where he developed his passion for powerlifting. Fuelled by an impressive diet, Luke's daily intake includes porridge, tortellini pasta, and 4-5 meals a day, totaling around 6000 calories. 🍽️ Everyone wish Luke good luck ahead of his competition! Go, Luke!",
      "media": [],
      "reactionsCount": 4129,
      "commentsCount": 171,
      "textLinks": [],
      "textTags": []
    },
    {
      "actor": "Amazon",
      "actorLink": "https://www.linkedin.com/company/amazon?trk=organization_guest_main-feed-card_feed-actor-name",
      "postDate": "1w",
      "text": "Love who you want to love. Be who you want to be. Here's to equality. Here's to Pride. 🏳️🌈 🏳️⚧️",
      "media": [],
      "reactionsCount": 2934,
      "commentsCount": 152,
      "textLinks": [],
      "textTags": []
    },
    {
      "actor": "Matt Garman",
      "actorLink": "https://www.linkedin.com/in/mattgarman?trk=organization_guest_main-feed-card_feed-actor-name",
      "postDate": "1w",
      "text": "Amazon reposted thisSharing a note I sent to all AWS employees today: Team, Over the past 18 years, I've had the privilege of working alongside the most talented, innovative, and customer-obsessed people on the planet. The journey has been nothing short of amazing, and today, I'm incredibly excited to mark Day 1 as CEO of AWS. From the very beginning, we’ve been driven by a commitment to deliver innovative products and services that solve real problems for our customers, and to anticipate ones they haven’t encountered yet. I love how this customer obsession allows us to tackle what sometimes seems impossible—and that relentless focus is still at our core today. We remain intent on providing secure, high-performing, sustainable, and operationally excellent cloud infrastructure and services that customers and partners can trust with their most precious data and workloads. The advances we're seeing in generative AI present one of the most exciting technological opportunities of our lifetimes, and thanks to all of you, we are helping tens of thousands of customers across every industry move quickly with this technology and change the way they work. As we continue expanding our array of building blocks to help customers take advantage of new technologies, we’re also continually growing our infrastructure around the world to help them securely run their mission-critical workloads. AWS has always been a place where smart risk-taking, customer obsession, and a never-ending drive to innovate are embraced and celebrated. With that foundation, there’s massive opportunity ahead. I’m looking forward to building this next chapter together. Matt",
      "media": [],
      "reactionsCount": 18587,
      "commentsCount": 548,
      "textLinks": [
        "https://www.linkedin.com/company/amazon?trk=organization_guest_main-feed-card_feed-reaction-header"
      ],
      "textTags": []
    },
    {
      "actor": "Amazon",
      "actorLink": "https://www.linkedin.com/company/amazon?trk=organization_guest_main-feed-card_feed-actor-name",
      "postDate": "1w",
      "text": "Great to be named one of TIME magazine’s 100 most influential companies for 2024. TIME’s annual list reflects companies making extraordinary impacts worldwide – for us that includes our investments in #AI and expansion into South Africa. The selection process involved nominations across sectors, followed by rigorous evaluation by TIME’s editors on key criteria such as impact, innovation, ambition, and success. 🙏 Full article here: https://amzn.to/3VaN2RA",
      "media": [],
      "reactionsCount": 1596,
      "commentsCount": 123,
      "textLinks": [
        "https://www.linkedin.com/company/time?trk=organization_guest_main-feed-card-text",
        "https://amzn.to/3VaN2RA?trk=organization_guest_main-feed-card-text"
      ],
      "textTags": [
        {
          "hashtag": "#AI",
          "link": "https://www.linkedin.com/signup?session_redirect=https://www.linkedin.com/feed/hashtag/ai&trk=organization_guest_main-feed-card-text"
        }
      ]
    },
    {
      "actor": "Amazon",
      "actorLink": "https://www.linkedin.com/company/amazon?trk=organization_guest_main-feed-card_feed-actor-name",
      "postDate": "2w",
      "text": "Happiness is seeing these faces every day at work! 🐶",
      "media": [],
      "reactionsCount": 29531,
      "commentsCount": 653,
      "textLinks": [],
      "textTags": []
    },
    {
      "actor": "Amazon",
      "actorLink": "https://www.linkedin.com/company/amazon?trk=organization_guest_main-feed-card_feed-actor-name",
      "postDate": "2w",
      "text": "At a time when rapid responses to natural disasters are essential, we have taken a decisive step: Our new disaster relief base in Rheinberg, Germany, near Düsseldorf, is now operational. 🚀 🇩🇪 We have 13 Disaster Relief Hubs that span across Germany, Australia, India, Japan, and the United States. These bases allow us to respond efficiently to emergencies like floods, fires, and earthquakes by leveraging our global logistics network to quickly deliver relief supplies. Our items in stock include tents, blankets, camp beds, mats, sleeping bags, and hygiene kits with soap, toothbrushes, and toothpaste. Our data analysis confirms that over 80% of the items required in the event of a disaster are always the same, which underlines our preparedness and efficiency. We work closely with national and international aid organizations such as Deutsches Rotes Kreuz, Save the Children Deutschland, and IOM - UN Migration to meet their needs and procure the products they need in advance. 👀 Thanks to our teams, and the NGO partners for working to build this together. 👏 ⛑️",
      "media": [],
      "reactionsCount": 2350,
      "commentsCount": 117,
      "textLinks": [
        "https://de.linkedin.com/company/deutschesroteskreuz?trk=organization_guest_main-feed-card-text",
        "https://de.linkedin.com/company/save-the-children-deutschland?trk=organization_guest_main-feed-card-text",
        "https://ch.linkedin.com/company/iom?trk=organization_guest_main-feed-card-text"
      ],
      "textTags": []
    },
    {
      "actor": "Amazon",
      "actorLink": "https://www.linkedin.com/company/amazon?trk=organization_guest_main-feed-card_feed-actor-name",
      "postDate": "2w Edited",
      "text": "\"I think an embarrassing amount of how well you do [in your career] has to do with attitude. Do you work hard? Are you more can-do than nay-saying? Do you show up on time? Do you do what you said you were going to do? Can you work on a team? Those things seem so simple, and there’s so many things you can’t control in your work life, but you can control your attitude.\" Our CEO, Andy Jassy, sat down for an exclusive interview with LinkedIn’s CEO, Ryan Roslansky, to talk about his unique career journey, including his top 3 pieces of career advice. You definitely want to check it out! ⬇️ What career advice would you give to someone?",
      "media": [],
      "reactionsCount": 2409,
      "commentsCount": 136,
      "textLinks": [
        "https://www.linkedin.com/in/andy-jassy-8b1615?trk=organization_guest_main-feed-card-text",
        "https://www.linkedin.com/company/linkedin?trk=organization_guest_main-feed-card-text",
        "https://www.linkedin.com/in/ryanroslansky?trk=organization_guest_main-feed-card-text"
      ],
      "textTags": []
    }
  ],
  "affliatedPages": [],
  "similarPages": [
    {
      "title": "Google",
      "subtitle": "Software Development",
      "location": "Mountain View, CA",
      "link": "https://www.linkedin.com/company/google?trk=similar-pages",
      "image": "https://media.licdn.com/dms/image/C4D0BAQHiNSL4Or29cg/company-logo_100_100/0/1631311446380?e=2147483647&v=beta&t=5bmvSDVt4i-ECxTU43yiS4iXUM4inJiG-e9PHOUlxx0"
    },
    {
      "title": "Microsoft",
      "subtitle": "Software Development",
      "location": "Redmond, Washington",
      "link": "https://www.linkedin.com/company/microsoft?trk=similar-pages",
      "image": "https://media.licdn.com/dms/image/C560BAQE88xCsONDULQ/company-logo_100_100/0/1630652622688/microsoft_logo?e=2147483647&v=beta&t=4ft1hh_UdO2TMuqRWlFPHTTr2B3BN0E2LmTE6tEYwJI"
    },
    {
      "title": "Apple",
      "subtitle": "Computers and Electronics Manufacturing",
      "location": "Cupertino, California",
      "link": "https://www.linkedin.com/company/apple?trk=similar-pages",
      "image": "https://media.licdn.com/dms/image/C560BAQHdAaarsO-eyA/company-logo_100_100/0/1630637844948/apple_logo?e=2147483647&v=beta&t=9XgJ_AXIJiidixRVc0ZwJj-822U17Q2mbkNSPpTqbXg"
    },
    {
      "title": "Deloitte",
      "subtitle": "Business Consulting and Services",
      "location": "",
      "link": "https://www.linkedin.com/company/deloitte?trk=similar-pages",
      "image": "https://media.licdn.com/dms/image/C560BAQGNtpblgQpJoQ/company-logo_100_100/0/1662120928214/deloitte_logo?e=2147483647&v=beta&t=KhIfaHWyu1aAgyyImEhYDprMjFP3LaMR0E7NF2MPxMY"
    },
    {
      "title": "Netflix",
      "subtitle": "Entertainment Providers",
      "location": "Los Gatos, CA",
      "link": "https://www.linkedin.com/company/netflix?trk=similar-pages",
      "image": "https://media.licdn.com/dms/image/C4E0BAQEVb0ZISWk8vQ/company-logo_100_100/0/1631355051964?e=2147483647&v=beta&t=_82G5gJfq-rmofKHPHZOMBYvtHfTF8Z2qA_zAUvcVV4"
    },
    {
      "title": "IBM",
      "subtitle": "IT Services and IT Consulting",
      "location": "Armonk, New York, NY",
      "link": "https://www.linkedin.com/company/ibm?trk=similar-pages",
      "image": "https://media.licdn.com/dms/image/D560BAQGiz5ecgpCtkA/company-logo_100_100/0/1688684715866/ibm_logo?e=2147483647&v=beta&t=5zkuzxYrW1Iyx8oUa-u7lMSQ9TN1Q9D87M_0ybQf3NQ"
    },
    {
      "title": "Meta",
      "subtitle": "Software Development",
      "location": "Menlo Park, CA",
      "link": "https://www.linkedin.com/company/meta?trk=similar-pages",
      "image": "https://media.licdn.com/dms/image/C4E0BAQFdNatYGiBelg/company-logo_100_100/0/1636138754252/facebook_logo?e=2147483647&v=beta&t=ULaTUKRgzMzLCy5-pLoRMfMKpEI4OApXM5C9pEDZSDs"
    },
    {
      "title": "Flipkart",
      "subtitle": "Technology, Information and Internet",
      "location": "Bangalore, Karnataka",
      "link": "https://in.linkedin.com/company/flipkart?trk=similar-pages",
      "image": "https://media.licdn.com/dms/image/C560BAQF6H8gAs-JyFg/company-logo_100_100/0/1630669478258/flipkart_logo?e=2147483647&v=beta&t=AfdreZVmMDcWw7rYTg7ythrTwdm4yKU2gYlM90Stnd0"
    },
    {
      "title": "Amazon Web Services (AWS)",
      "subtitle": "IT Services and IT Consulting",
      "location": "Seattle, WA",
      "link": "https://www.linkedin.com/company/amazon-web-services?trk=similar-pages",
      "image": "https://media.licdn.com/dms/image/C560BAQER_QnUTXrPJw/company-logo_100_100/0/1670264051233/amazon_web_services_logo?e=2147483647&v=beta&t=tI5mZm2XR_yMnLD5LQNmk8dQtVwGevKFXUHJlb8I_wE"
    },
    {
      "title": "Tata Consultancy Services",
      "subtitle": "IT Services and IT Consulting",
      "location": "Mumbai, Maharashtra",
      "link": "https://in.linkedin.com/company/tata-consultancy-services?trk=similar-pages",
      "image": "https://media.licdn.com/dms/image/D4D0BAQGsGR9p4ikS5w/company-logo_100_100/0/1708946550425/tata_consultancy_services_logo?e=2147483647&v=beta&t=jw02JCmA90t0qWePW3z8_xCTUrKd51xsWMD7K3Uqtzc"
    }
  ],
  "funding": {
    "basicInfo": {
      "name": "Amazon",
      "rounds": "3 total rounds",
      "link": "https://www.crunchbase.com/organization/amazon/funding_rounds/funding_rounds_list?utm_source=linkedin&utm_medium=referral&utm_campaign=linkedin_companies&utm_content=all_fundings_anon&trk=funding_all-rounds"
    },
    "lastRound": {
      "title": "Post IPO debt",
      "type": "",
      "date": "Feb 3, 2023",
      "link": "",
      "money": "US$ 8.0B"
    },
    "investors": []
  },
  "stock": {
    "symbol": "",
    "date": "",
    "data": {
      "symbol": null,
      "delayed": null
    },
    "price": "",
    "priceChange": "",
    "priceDaily": {},
    "dataSrouce": "Data from Refinitiv"
  },
  "products": []
}
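Once the company JSON is in hand, a common next step is to rank the posts in the `updates` array by engagement. The sketch below is one way to do that, assuming the payload has the shape shown in the example output above (`updates`, `reactionsCount`, `commentsCount`, `actor`, `postDate`, `text`). The `top_updates` helper and its `engagement` and `preview` fields are hypothetical names introduced here for illustration; they are not part of the Crawlbase library.

```python
def top_updates(company, limit=3):
    """Return the `limit` most-engaged posts from a scraped company payload.

    Engagement is defined here as reactions plus comments, which is an
    illustrative choice rather than anything Crawlbase prescribes.
    """
    updates = company.get("updates") or []

    def engagement(update):
        # Missing or null counts are treated as zero
        return (update.get("reactionsCount") or 0) + (update.get("commentsCount") or 0)

    ranked = sorted(updates, key=engagement, reverse=True)
    return [
        {
            "actor": u.get("actor", ""),
            "postDate": u.get("postDate", ""),
            "engagement": engagement(u),
            "preview": (u.get("text") or "")[:80],  # first 80 characters only
        }
        for u in ranked[:limit]
    ]


if __name__ == "__main__":
    # A trimmed sample using counts from the example output above
    sample = {
        "updates": [
            {"actor": "Amazon", "postDate": "7h",
             "text": "Looking to boost your AI skills?",
             "reactionsCount": 291, "commentsCount": 39},
            {"actor": "Amazon", "postDate": "2w",
             "text": "Happiness is seeing these faces every day at work!",
             "reactionsCount": 29531, "commentsCount": 653},
            {"actor": "Matt Garman", "postDate": "1w",
             "text": "Amazon reposted this",
             "reactionsCount": 18587, "commentsCount": 548},
        ]
    }
    for row in top_updates(sample, limit=2):
        print(row["engagement"], row["actor"], row["postDate"])
```

In practice you would pass the dictionary returned by the Storage API instead of the hard-coded sample.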

By following these steps, you can efficiently scrape LinkedIn company pages using Crawlbase's API. In the next section, we'll cover how to scrape LinkedIn feeds.

Crawlbase LinkedIn Feeds Scraper

Finally, let's explore how to use Crawlbase's LinkedIn feeds scraper to extract valuable data from LinkedIn feeds.

Scraping a LinkedIn Feed

To scrape a LinkedIn feed, you'll follow a similar process to scraping profiles and company pages. Here’s how you can do it:

from crawlbase import CrawlingAPI
import json

# Initialize Crawlbase API with your access token
crawling_api = CrawlingAPI({ 'token': 'YOUR_API_TOKEN' })

URL = 'https://www.linkedin.com/feed/update/urn:li:activity:7022155503770251267'

options = {
    'scraper': 'linkedin-feed',
    'async': 'true'
}

# Function to make a request using Crawlbase API
def make_crawlbase_request(url):
    response = crawling_api.get(url, options)
    if response['status_code'] == 200:
        # JSON is UTF-8 by spec (RFC 8259); latin1 would garble emoji and accented text
        return json.loads(response['body'].decode('utf-8'))
    else:
        print("Failed to fetch the page. Status code:", response['status_code'])
        return None

def scrape_feed(url):
    try:
        json_response = make_crawlbase_request(url)
        if json_response:
            return json_response
    except Exception as e:
        print(f"Request failed: {e}")

    return None

if __name__ == '__main__':
    scraped_data = scrape_feed(URL)
    print(json.dumps(scraped_data, indent=2))

This script initializes the Crawlbase API, sets the URL of the LinkedIn feed post you want to scrape, and specifies the linkedin-feed scraper. Because the async option is set to 'true', the request does not return the feed data directly. Instead, the JSON response contains a request identifier (rid) that you use later to fetch the stored result.

Example Output:

{
  "rid": "977b3381ab11f938d6522775"
}
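
Since the asynchronous response carries only the rid, it can help to pull that value out defensively before passing it to the Storage API. The helper below is a minimal sketch: `extract_rid` is a hypothetical name, not part of the Crawlbase library, and it assumes the parsed response is a dictionary shaped like the example output above.

```python
def extract_rid(json_response):
    """Return the 'rid' from a parsed async response, or None if it is absent.

    Hypothetical helper: assumes a dict shaped like {"rid": "..."},
    as in the example output above.
    """
    if not isinstance(json_response, dict):
        return None
    rid = json_response.get('rid')
    # Treat empty or non-string values as missing
    if isinstance(rid, str) and rid.strip():
        return rid.strip()
    return None


if __name__ == '__main__':
    print(extract_rid({'rid': '977b3381ab11f938d6522775'}))
    print(extract_rid(None))
```

You could call it as `rid = extract_rid(scraped_data)` and continue to the retrieval step only when the result is not None.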

Retrieving Data from Crawlbase Storage API

As with profile and company page scraping, asynchronous requests will return a rid. You can use this rid to retrieve the stored data.

from crawlbase import StorageAPI
import json

# Initialize Crawlbase Storage API with your access token
storage_api = StorageAPI({ 'token': 'YOUR_API_TOKEN' })

RID = 'your_request_identifier'

# Function to retrieve data from Crawlbase storage
def retrieve_data(rid):
    response = storage_api.get(f'https://api.crawlbase.com/storage?rid={rid}')
    if response['status_code'] == 200:
        return json.loads(response['body'].decode('latin1'))
    else:
        print("Failed to retrieve the data. Status code:", response['status_code'])
        return None

if __name__ == '__main__':
    retrieved_data = retrieve_data(RID)
    print(json.dumps(retrieved_data, indent=2))

This script retrieves and prints the stored feed data using the rid.
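
An asynchronous crawl may not be finished the first time you ask for it, in which case a single call comes back empty. The sketch below retries with a growing delay between attempts. `wait_for_result` and its parameters are hypothetical names, and `fetch` stands for any function that returns the parsed data or None, such as `retrieve_data` above. This post does not say how long Crawlbase takes to store a result, so the attempt count and delays are placeholder values to adjust.

```python
import time


def wait_for_result(fetch, rid, attempts=5, initial_delay=2.0, backoff=2.0,
                    sleep=time.sleep):
    """Call fetch(rid) until it returns data or the attempts run out.

    fetch: callable that returns parsed data, or None when nothing is ready.
    sleep: injectable so the waiting can be replaced in tests.
    Returns the data, or None if every attempt came back empty.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        data = fetch(rid)
        if data is not None:
            return data
        if attempt < attempts:
            sleep(delay)      # wait before the next try
            delay *= backoff  # lengthen the wait each time
    return None
```

With the script above in scope, `retrieved_data = wait_for_result(retrieve_data, RID)` would replace the single `retrieve_data(RID)` call. The data that comes back has the same shape as what the original script prints.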

Example Output:

{
  "feeds": [
    {
      "text": "#AlphabetInc is eliminating 12,000 jobs, its chief executive said in a staff memo The cuts mark the latest to shake the #technology sector and come days after rival Microsoft Corp said it would lay off 10,000 workers. Full report - https://lnkd.in/dfxXc2N4",
      "images": [
        "https://media.licdn.com/dms/image/C4D22AQHvTzTp5mnMcg/feedshare-shrink_2048_1536/0/1674212335928?e=2147483647&v=beta&t=Aq3WKkxF1Q5ZwGB6ax6OOWRtCW7Vlz8KDdpBvvK4K_0"
      ],
      "videos": [],
      "datetime": "1y",
      "postUrl": "https://in.linkedin.com/company/hindustantimes?trk=public_post_feed-actor-image",
      "userName": "Hindustan Times",
      "reactionCount": 1177,
      "commentsCount": 13,
      "links": [
        {
          "text": "#AlphabetInc",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Falphabetinc&trk=public_post-text"
        },
        {
          "text": "#technology",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Ftechnology&trk=public_post-text"
        },
        {
          "text": "https://lnkd.in/dfxXc2N4",
          "url": "https://lnkd.in/dfxXc2N4?trk=public_post-text"
        }
      ],
      "comments": [
        {
          "userName": "achuthananadan jeevandas",
          "profile": "https://in.linkedin.com/in/achuthananadan-jeevandas-861265181?trk=public_post_comment_actor-name",
          "headline": "LET US GO FORWARD FROM PHYSICAL WORLD TO VIRTUAL REALITY WORLD AND CONQUER A SPACE IN METAVERSE -EDUCATE,EMPOWER,ENGAGE WEB3.0 NOW IN INTERNET JOINNING HANDS WITH METAVERSE PASSION PEOPLE",
          "text": "We all know he is finding very hard to take this decision.we all should support him in fhis moment."
        },
        {
          "userName": "Arpit Saxena",
          "profile": "https://in.linkedin.com/in/arpit-saxena-074266135?trk=public_post_comment_actor-name",
          "headline": "\"Engineer in finance—where every transaction is a step towards. Trying merging tech and tradition in banking, ensuring seamless fiscal sustainability.\" Cost effectively easy. Founder@unity_fintech",
          "text": "They will create a new company comparable to google 🤣🤣🤣🤣 We all are responsible to our karma 🙂"
        },
        {
          "userName": "chandan chhavi",
          "profile": "https://ae.linkedin.com/in/chandan-chhavi-62b49a18?trk=public_post_comment_actor-name",
          "headline": "Drive Safely",
          "text": "In lockdown period all have put on fat even companies. Now they are shedding their fat."
        },
        {
          "userName": "Aravindan A.R",
          "profile": "https://in.linkedin.com/in/aravindanparashar?trk=public_post_comment_actor-name",
          "headline": "Retail professional with 18+ years of experience",
          "text": "God save the jobs. If large MNC's layoff people, what about smaller companies. Inflation is around"
        },
        {
          "userName": "Asish Bishoi",
          "profile": "https://in.linkedin.com/in/asishbishoi?trk=public_post_comment_actor-name",
          "headline": "Immediate Joiner | Systems Engineer | TCS | 1 x AWS Certified | Full Stack | MySQL | MongoDB |",
          "text": "I think layoffs already happened."
        },
        {
          "userName": "Kanika Chaudhary",
          "profile": "https://in.linkedin.com/in/kanikachaudhary25?trk=public_post_comment_actor-name",
          "headline": "HR Executive | Volunteer",
          "text": "This is surreal! All i can see are lay offs happening. Its a request to all companies to please find some other way out. #sad"
        },
        {
          "userName": "Richard C. Clark, MISM",
          "profile": "https://www.linkedin.com/in/rickclark1972?trk=public_post_comment_actor-name",
          "headline": "Manufacturing Technologist / (Captain, FA, USRA)",
          "text": "Not surprising. Executives will not sacrifice their profits to benefit others"
        },
        {
          "userName": "SUNJOY GUPTA",
          "profile": "https://in.linkedin.com/in/sunjoy-gupta-b50a7792?trk=public_post_comment_actor-name",
          "headline": "Business Head At Galaxy Tech",
          "text": "Wow that's the \"Show Stopper\""
        },
        {
          "userName": "JAIDEEP CHATTERJEE",
          "profile": "https://in.linkedin.com/in/jaideep-chatterjee-556ab433?trk=public_post_comment_actor-name",
          "headline": "Associate professor of mgmt studies, an examiner and author",
          "text": "Digitized life...."
        },
        {
          "userName": "Madhvi S.",
          "profile": "https://in.linkedin.com/in/madhvi-s-59459a63?trk=public_post_comment_actor-name",
          "headline": "Accounts Payable (P2P)",
          "text": "Worst situation"
        }
      ]
    },
    {
      "text": "The Moscow Exchange (MOEX) suspended trading in dollars and euros on June 12 after the #US announced a new raft of measures targeting #Russia's financial institutions.",
      "images": [],
      "videos": [
        {
          "poster": "https://media.licdn.com/dms/image/D5605AQEiJr5OjFFakg/feedshare-thumbnail_720_1280/0/1718281419437?e=2147483647&v=beta&t=expMXkSOdZC3b4J6CfjyQTCxJCWA3xfjTwOYtCvzwPs",
          "src": null,
          "duration": "0:00"
        }
      ],
      "datetime": "3h",
      "postUrl": "https://www.linkedin.com/posts/hindustantimes_us-russia-activity-7207079298128568320-3lh5",
      "userName": "Hindustan Times",
      "reactionCount": 34,
      "commentsCount": "",
      "links": [
        {
          "text": "#US",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fus&trk=public_post_main-feed-card-text"
        },
        {
          "text": "#Russia",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Frussia&trk=public_post_main-feed-card-text"
        }
      ],
      "comments": []
    },
    {
      "text": "Prime Minister Narendra Modi departs for Italy. At the invitation of Italian PM Giorgia Meloni, PM Modi is travelling to Apulia, Italy to participate in G7 Outreach Summit on 14th June. The two leaders will have a bilateral meeting on the sidelines of the Summit. Track updates https://lnkd.in/fWuZP46",
      "images": [],
      "videos": [
        {
          "poster": "https://media.licdn.com/dms/image/D5605AQGmExZtcZia2A/videocover-high/0/1718300978958?e=2147483647&v=beta&t=pWo8ZjNAXsGWH996OgBfKHRcKwzhrlQcufznC9L0xeA",
          "src": null,
          "duration": "0:00"
        }
      ],
      "datetime": "4h",
      "postUrl": "https://www.linkedin.com/posts/hindustantimes_prime-minister-narendra-modi-departs-for-activity-7207076689825136641-AjzN",
      "userName": "Hindustan Times",
      "reactionCount": 654,
      "commentsCount": 12,
      "links": [
        {
          "text": "https://lnkd.in/fWuZP46",
          "url": "https://lnkd.in/fWuZP46?trk=public_post_main-feed-card-text"
        }
      ],
      "comments": []
    },
    {
      "text": "An apartment building in Kuwait, housing foreign labour workers, left 49 workers dead and at least 50 injured. Out of the 49 casualties, 41 labourers were Indian, confirmed officials. MoS KV Singh met Indians who were injured and reiterated that they were safe and receiving treatment. https://lnkd.in/gtGWVt4Y",
      "images": [
        "https://media.licdn.com/dms/image/D5622AQEl2jgA25s9nA/feedshare-shrink_2048_1536/0/1718300926772?e=2147483647&v=beta&t=7OD5_sQi6vGwAP8EGexEpZ8KrRFnr7grgEJy0Evcw8E",
        "https://media.licdn.com/dms/image/D5622AQF8u56P85vfRQ/feedshare-shrink_2048_1536/0/1718300923405?e=2147483647&v=beta&t=At_OjVduEnDGSTtwtGZxoxglZo21AnWRf0s4wp0lEYE",
        "https://media.licdn.com/dms/image/D5622AQFP5svgXO487A/feedshare-shrink_2048_1536/0/1718300924445?e=2147483647&v=beta&t=M6lvz_7tCJZLsXQaACxAt4DBFVSaoAtVbu2cQRJmbnk"
      ],
      "videos": [],
      "datetime": "4h",
      "postUrl": "https://www.linkedin.com/posts/hindustantimes_an-apartment-building-in-kuwait-housing-activity-7207076458437959680-ZkdS",
      "userName": "Hindustan Times",
      "reactionCount": 230,
      "commentsCount": "",
      "links": [
        {
          "text": "https://lnkd.in/gtGWVt4Y",
          "url": "https://lnkd.in/gtGWVt4Y?trk=public_post_main-feed-card-text"
        }
      ],
      "comments": []
    },
    {
      "text": "A water pipeline of the #Delhi Jal Board is seen bursting amid the #watercrisis in New Delhi, India. 📸Sanchit Khanna/ HT",
      "images": [
        "https://media.licdn.com/dms/image/D5622AQHT0XE7q9TNdQ/feedshare-shrink_2048_1536/0/1718281871877?e=2147483647&v=beta&t=zVWgjZuvzglA8Wqe7UHTl7dkFMO2FKj9lNtIkk6YyKo",
        "https://media.licdn.com/dms/image/D5622AQGtOF6ciCUBEA/feedshare-shrink_800/0/1718281872017?e=2147483647&v=beta&t=vlzP2jCvFz3ycf1AYVbtyBvBGdnmiGJ5IJrVCY4nDWk",
        "https://media.licdn.com/dms/image/D5622AQEQMgQ43_qkFw/feedshare-shrink_800/0/1718281871798?e=2147483647&v=beta&t=O4lFO4FCF-aLU8r9-m_iuGD25z3QH_bbP5YEB09GIx0"
      ],
      "videos": [],
      "datetime": "4h",
      "postUrl": "https://www.linkedin.com/posts/hindustantimes_delhi-watercrisis-activity-7207071754979078144-DOFk",
      "userName": "Hindustan Times",
      "reactionCount": "",
      "commentsCount": "",
      "links": [
        {
          "text": "#Delhi",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fdelhi&trk=public_post_main-feed-card-text"
        },
        {
          "text": "#watercrisis",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fwatercrisis&trk=public_post_main-feed-card-text"
        }
      ],
      "comments": []
    },
    {
      "text": "#Hamas criticized #US State Secretary #AntonyBlinken for attributing the stalled ceasefire talks to the group.",
      "images": [],
      "videos": [
        {
          "poster": "https://media.licdn.com/dms/image/D5605AQHN2qtA8Jqh-g/feedshare-thumbnail_720_1280/0/1718281214449?e=2147483647&v=beta&t=WqfahQkQtNSt0AKZgn7cXP3HfMAJJ-RoFyWwCu42npY",
          "src": null,
          "duration": "0:00"
        }
      ],
      "datetime": "4h",
      "postUrl": "https://www.linkedin.com/posts/hindustantimes_hamas-us-antonyblinken-activity-7207064191281623044-mxI5",
      "userName": "Hindustan Times",
      "reactionCount": 20,
      "commentsCount": "",
      "links": [
        {
          "text": "#Hamas",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fhamas&trk=public_post_main-feed-card-text"
        },
        {
          "text": "#US",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fus&trk=public_post_main-feed-card-text"
        },
        {
          "text": "#AntonyBlinken",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fantonyblinken&trk=public_post_main-feed-card-text"
        }
      ],
      "comments": []
    },
    {
      "text": "Amid the ongoing #heatwave, an air conditioner blast triggered a massive fire in #Noida. Here’s what you should know if you reside in a multi-storey building Swipe to know more Details here: https://lnkd.in/gb-3-yeJ",
      "images": [],
      "videos": [],
      "datetime": "5h",
      "postUrl": "https://www.linkedin.com/posts/hindustantimes_noida-blast-activity-7207056677081088001-Ybb7",
      "userName": "Hindustan Times",
      "reactionCount": 121,
      "commentsCount": 1,
      "links": [
        {
          "text": "#heatwave",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fheatwave&trk=public_post_main-feed-card-text"
        },
        {
          "text": "#Noida",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fnoida&trk=public_post_main-feed-card-text"
        },
        {
          "text": "https://lnkd.in/gb-3-yeJ",
          "url": "https://lnkd.in/gb-3-yeJ?trk=public_post_main-feed-card-text"
        }
      ],
      "comments": []
    },
    {
      "text": "PM #NarendraModi's first foreign trip in 3rd term: What's on #India's agenda at #G7Summit in #Italy?",
      "images": [],
      "videos": [
        {
          "poster": "https://media.licdn.com/dms/image/D5605AQF2GsUOSXLaYQ/feedshare-thumbnail_720_1280/0/1718280903846?e=2147483647&v=beta&t=Z_xZfBeMPUxEkznjV2LK-SDXgmBqQdq6WOKhJxoqemg",
          "src": null,
          "duration": "0:00"
        }
      ],
      "datetime": "5h",
      "postUrl": "https://www.linkedin.com/posts/hindustantimes_narendramodi-india-g7summit-activity-7207049091053211649-S9DD",
      "userName": "Hindustan Times",
      "reactionCount": 385,
      "commentsCount": 4,
      "links": [
        {
          "text": "#NarendraModi",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fnarendramodi&trk=public_post_main-feed-card-text"
        },
        {
          "text": "#India",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Findia&trk=public_post_main-feed-card-text"
        },
        {
          "text": "#G7Summit",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fg7summit&trk=public_post_main-feed-card-text"
        },
        {
          "text": "#Italy",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fitaly&trk=public_post_main-feed-card-text"
        }
      ],
      "comments": []
    },
    {
      "text": "While sharing the screenshot of the #LinkedIn profile on X, a startup co-founder expressed that it is the “most absurd education history of all time”.",
      "images": [],
      "videos": [],
      "datetime": "6h",
      "postUrl": "https://www.linkedin.com/posts/hindustantimes_thanos-of-linkedin-profile-with-oxford-activity-7207041555327537154-ADFp",
      "userName": "Hindustan Times",
      "reactionCount": 8,
      "commentsCount": "",
      "links": [
        {
          "text": "#LinkedIn",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Flinkedin&trk=public_post_main-feed-card-text"
        }
      ],
      "comments": []
    },
    {
      "text": "#Israeli Defence Forces stepped up their offensive against the #Lebanon-based #Hezbollah militant group after it launched its biggest attack, targeting northern #Israel.",
      "images": [],
      "videos": [
        {
          "poster": "https://media.licdn.com/dms/image/D5605AQGLHPppL1TVrA/feedshare-thumbnail_720_1280/0/1718277003542?e=2147483647&v=beta&t=1OHZ_B4EpUxLUZ0B3-O61JFbQ-3Yk-RowxoGq4sq1S4",
          "src": null,
          "duration": "0:00"
        }
      ],
      "datetime": "6h",
      "postUrl": "https://www.linkedin.com/posts/hindustantimes_israeli-lebanon-hezbollah-activity-7207033980276121601-vDvH",
      "userName": "Hindustan Times",
      "reactionCount": 106,
      "commentsCount": 3,
      "links": [
        {
          "text": "#Israeli",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fisraeli&trk=public_post_main-feed-card-text"
        },
        {
          "text": "#Lebanon",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Flebanon&trk=public_post_main-feed-card-text"
        },
        {
          "text": "#Hezbollah",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fhezbollah&trk=public_post_main-feed-card-text"
        },
        {
          "text": "#Israel",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fisrael&trk=public_post_main-feed-card-text"
        }
      ],
      "comments": []
    },
    {
      "text": "#InPics | Family members of the #UphaarCinema fire victims grieved on the 27th anniversary of the tragedy at the Uphaar Memorial, located in front of Uphaar Cinema in Green Park, #NewDelhi, India, on June 13, 2024. The tragedy occurred on June 13, 1997, claiming the lives of 59 people and injured over 100 due to suffocation in the ensuing stampede.",
      "images": [
        "https://media.licdn.com/dms/image/D5622AQFCxIdr0s44Cw/feedshare-shrink_2048_1536/0/1718276698660?e=2147483647&v=beta&t=h3pS2CDEkkkq8JtiZSHqpuY81gog0Etm6mvbpGlUTbA",
        "https://media.licdn.com/dms/image/D5622AQH6kFfQgpWJ3g/feedshare-shrink_2048_1536/0/1718276698022?e=2147483647&v=beta&t=pdtaCOfLsGRMq78IyXHwUfwypKICdPU8yJTy5HDaPvo",
        "https://media.licdn.com/dms/image/D5622AQFn45KxLLAo4w/feedshare-shrink_2048_1536/0/1718276698276?e=2147483647&v=beta&t=f3Ac2rjWwHFuTlsgBbnzp18eDzVAADWvmnKUWKlcR8M"
      ],
      "videos": [],
      "datetime": "7h",
      "postUrl": "https://www.linkedin.com/posts/hindustantimes_inpics-uphaarcinema-newdelhi-activity-7207026470001516544-U4I0",
      "userName": "Hindustan Times",
      "reactionCount": 208,
      "commentsCount": 3,
      "links": [
        {
          "text": "#InPics",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Finpics&trk=public_post_main-feed-card-text"
        },
        {
          "text": "#UphaarCinema",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fuphaarcinema&trk=public_post_main-feed-card-text"
        },
        {
          "text": "#NewDelhi",
          "url": "https://www.linkedin.com/signup/cold-join?session_redirect=https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fhashtag%2Fnewdelhi&trk=public_post_main-feed-card-text"
        }
      ],
      "comments": []
    }
  ]
}

By following these steps, you can effectively scrape LinkedIn feeds using Crawlbase's Crawling API.
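
In the example output, `reactionCount` and `commentsCount` arrive either as numbers or as empty strings, and hashtags appear as entries in each post's `links` list. The sketch below turns the retrieved JSON into one summary row per post. The function names are hypothetical, and treating a blank count as 0 is an assumption: a blank may simply mean the value was not available on the page.

```python
def to_int(value):
    """Convert a count field to int; blank or missing values become 0."""
    if isinstance(value, int):
        return value
    try:
        return int(str(value).strip())
    except ValueError:
        return 0


def hashtags(post):
    """Return the hashtag texts found in a post's 'links' list."""
    return [link['text'] for link in post.get('links', [])
            if (link.get('text') or '').startswith('#')]


def summarize_feed(data):
    """Build one row per post, sorted by reaction count (highest first)."""
    rows = []
    for post in data.get('feeds', []):
        rows.append({
            'postUrl': post.get('postUrl'),
            'reactions': to_int(post.get('reactionCount')),
            'comments': to_int(post.get('commentsCount')),
            'hashtags': hashtags(post),
        })
    return sorted(rows, key=lambda row: row['reactions'], reverse=True)
```

Calling `summarize_feed(retrieved_data)` on the stored result gives a list you can load into a spreadsheet or data frame without tripping over the mixed count types.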

Supercharge Your Career Goals with Crawlbase

Scraping LinkedIn data can provide valuable insights for various applications, from job market analysis to competitive research. Crawlbase automates the process of gathering LinkedIn data, enabling you to focus on analyzing and utilizing the information. Using Crawlbase's powerful Crawling API and Python, you can efficiently scrape LinkedIn profiles, company pages, and feeds.

If you're looking to expand your web scraping capabilities, consider exploring the following guides on scraping other popular websites.

📜 How to Scrape Indeed Job Posts
📜 How to Scrape Emails from LinkedIn
📜 How to Scrape Airbnb
📜 How to Scrape Realtor.com
📜 How to Scrape Expedia

If you have any questions or feedback, our support team is always available to assist you on your web scraping journey. Happy Scraping!

Frequently Asked Questions (FAQs)

Q. Is scraping LinkedIn legal?

Scraping LinkedIn is legal as long as you do not violate LinkedIn's terms of service. It's important to review LinkedIn's policies and ensure that your scraping activities comply with legal and ethical guidelines. Always respect privacy and data protection laws, and consider using officially provided APIs when available.

Q. How to scrape LinkedIn?

To scrape LinkedIn, you can use Crawlbase's Crawling API. First, set up your Python environment and install the Crawlbase library. Choose the appropriate scraper for your needs (profile, company, or feed), and make asynchronous requests to gather data. Retrieve the data using the Crawlbase Storage API, which stores the response for easy access.

Q. What are the challenges in scraping LinkedIn?

Scraping LinkedIn involves several challenges. LinkedIn has strong anti-scraping measures that can block your activities. The dynamic nature of LinkedIn's content makes it difficult to extract data consistently. Additionally, you must ensure compliance with legal and ethical standards, as violating LinkedIn's terms of service can lead to account bans or legal action. Using a reliable tool like Crawlbase can help mitigate some of these challenges by providing robust scraping capabilities and adhering to best practices.
