Artificial Intelligence (AI) - Internship

E

Euprime

Internship Remote INR 15,000 - 25,000 / stipend Posted 2 weeks ago
Application deadline has passed

I want to create a web agent/crawler that would help me with the following.

Think and act like a business developer that crawls the web for relevant

information to output very well qualified leads. The tool should then be able

to, based on the criteria we provided, prioritize or rank the probability of

these people wanting to work with us. 3D in-vitro models helping researchers

to design new therapies.

Stage 1: Identification (The Input)- It scans target profiles based on your

criteria (e.g., Director of Toxicology, Head of Preclinical Safety) from

LinkedIn Search or Sales Navigator.

Data Source: LinkedIn, PubMed (for authors of recent relevant papers),

and Conference attendee lists (e.g., SOT - Society of Toxicology).

Stage 2: Enrichment (The Data Gathering) - For each identified person, the

tool queries external databases to find

1\. Contact Info: Business email (e.g., @pfizer.com) and phone numbers.

2\. Location Data: It distinguishes between the Person's Location (e.g.,

remote in Colorado) and the Company HQ (e.g., Cambridge, MA).

Stage 3: Ranking (The Probability Engine) - It applies the "Propensity to Buy"

score to each person based on weighted criteria (detailed below).

The Probability Ranking Logic: To prioritize who wants to work with 3D-in

vitro models for therapies, the tool would assign a score (0-100) based on

these weighted signals.

\- Signal

\- Category Criteria Example Scoring

\- Weight

Title contains:

Role Fit: Toxicology, Safety, Hepatic, 3D - High(+30)

Company Intent: Company recently raised Series A/B funding (cash to spend),

High (+20)

Technographic: Company already uses similar tech (e.g., in vitro models),

Medium(+15) open to New Approach Methodologies (or NAMs), Medium(+10)

Medium

(+15)

Location Located in a Hub (Boston/Cambridge, Bay Area, Basel,

UK Golden Triangle)

Medium (+10)

Scientific

Intent

Published a paper on Title contains

(Drug-Induced Liver Injury) in

last 2 years

Very High

(+40)

Result: A list of 500 people is instantly re-ordered.

Junior Scientist at a non-funded

startup might score 15/100, while a Director of Safety Assessment at a Series

B

biotech in Cambridge, MA who just published on liver toxicity would score

95/100.

The Output: The Lead Generation Dashboard

The final output would be a dynamic, searchable table (exportable to

Excel/CSV) with the following columns:

Rank Probability Name Title Company Location HQ Email LinkedIn Action

Search Functionality: You can type Boston or Oncology to filter this list

instantly.

Location Split: The table explicitly shows if the person is remote (e.g.,

Living in

Texas, HQ in Boston), which helps BD decide when to call or if a physical

meeting is possible.

4\. How to Build/Deploy This

Since you want this eventually to be a there are two paths:

Use a tool like Clay or Apollo.io.

o Clay is specifically designed to do exactly this: You feed it a LinkedIn

search URL, it finds the emails, maps the locations, uses AI to read their

company description to see if they drug discovery, assigns a score,

and puts it all in a Google Sheet/Dashboard.

Path B: Custom Dashboard

o A simple web app (Python/Streamlit) that connects to the LinkedIn API (or

a provider like Proxycurl) to fetch data and renders the table exactly as

described above.

Patch C: You can use your own new methodology

what are some of the main types of websites that the web agent needs to search

through to procure the information to be efficient?

To identify, enrich, and rank the best leads, the web agent needs to

systematically harvest data from five specific categories of websites.

This approach is what differentiates a generic lead list from the high-

probability tool you are building.

1\. Professional Social Networks

This is your baseline for identification and employment data.

Primary Source: LinkedIn (and Sales Navigator).

Secondary Source: Xing (relevant since this technology has a strong European

footprint/HQ).

What the Agent looks for:

o Job Titles: Director of Toxicology, Head of Safety Assessment, VP

Preclinical

o Location: Distinguishing between Home (Remote) vs. Office

o Tenure: How long they have been in the role (new hires are often looking

to buy new tech).

2\. Scientific Publication Databases

Primary Sources: PubMed, Google Scholar, bioRxiv.

What the Agent looks for:

o Keywords: (Drug-Induced Liver Injury), 3D cell culture, Organ-

on-chip, Hepatic spheroids, Investigative Toxicology

o Recency: Papers published in the last 12-24 months (indicates active

research).

o Authorship: Focus on the Corresponding Author" (usually the PI or budget

holder) or the First Author (often the user/influencer).

Required Skills

Artificial intelligence (general)
Required
Data Science (general)
Required
Data Structures (general)
Required
Deep Learning (general)
Required
Machine Learning (general)
Required
Market research (general)
Required
Python (general)
Required

About Euprime

Euprime provides analytics software and statistical consulting. We use statistical modeling and data analysis techniques to solve business problems, which help companies and institutions create value for their clients and consumers.

Job Summary

Job Type: Internship
Location: Remote
Salary: INR 15,000 - 25,000 / stipend
Posted: January 9, 2026
Deadline: January 10, 2026

Ready to Apply?

Take the next step in your career journey. Join Euprime and make an impact.