
CoPilot for Pharma Research
Replace weeks of complex journal analysis with secure, instant, and searchable insights, accelerating research and innovation.
Problem
A leading pharmaceutical group aimed to improve the efficiency of its research lifecycle by establishing a centralised knowledge hub for its extensive collection of scientific journals.
However, the complexity of biochemistry journals—often filled with intricate charts and unstructured data tables—posed significant challenges. Reviewing and interpreting these journals required weeks of effort by research teams, creating bottlenecks in the discovery process.
Furthermore, retrieving specific insights from hundreds of journals to support active research proved time-consuming and inefficient, limiting the laboratory’s ability to accelerate innovation and make timely, data-driven decisions.
Solution
We implemented a multi-modal retrieval-augmented generation (RAG) system, delivered as a chatbot, that uses GPT-4 Vision and GPT-4o to automatically extract and summarize information from images such as scientific charts and experimental data.
These summaries are embedded with a text embedding model for efficient semantic search and retrieval. With each answer, the chatbot also attaches the relevant pages, charts, and tables as references that researchers can review and reuse in their reports.
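The retrieval step described above can be illustrated with a minimal, self-contained sketch. It is not the production implementation. The `SummaryRecord` fields, the file names, and the storage URIs are hypothetical, and `toy_embed` is a bag-of-words stand-in for a real embedding model such as text-embedding-3-large, so the example runs without any external service. What it shows is the pattern: each chart or table summary is embedded and indexed, and each search hit carries a pointer back to the source page and image so it can be attached as a reference.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class SummaryRecord:
    summary: str      # text summary a vision model produced for a chart or table
    source_pdf: str   # hypothetical document identifier
    page: int         # page the chart or table came from
    image_uri: str    # hypothetical location of the stored page image


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding call: a sparse bag-of-words vector."""
    tokens = (t.strip(".,;:()") for t in text.lower().split())
    return Counter(t for t in tokens if t)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(records: list[SummaryRecord]) -> list[tuple[Counter, SummaryRecord]]:
    """Embed every summary once and keep it next to its source metadata."""
    return [(toy_embed(r.summary), r) for r in records]


def search(query: str, index: list[tuple[Counter, SummaryRecord]], top_k: int = 2):
    """Return the top_k (score, record) pairs most similar to the query."""
    q = toy_embed(query)
    scored = [(cosine(q, vec), rec) for vec, rec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Illustrative data only; these records are invented for the example.
records = [
    SummaryRecord(
        "Dose-response curve showing enzyme inhibition with IC50 values",
        "journal_a.pdf", 4, "gs://example-bucket/journal_a/page_4.png",
    ),
    SummaryRecord(
        "Table listing buffer compositions and pH for each assay",
        "journal_b.pdf", 11, "gs://example-bucket/journal_b/page_11.png",
    ),
]
index = build_index(records)

for score, rec in search("enzyme inhibition IC50", index, top_k=1):
    # The answer would cite this page and image as its reference.
    print(f"{rec.source_pdf} p.{rec.page} ({rec.image_uri}) score={score:.2f}")
```

In a real deployment the toy vectors would be replaced by calls to the embedding model and the in-memory list by a vector store, while the metadata-per-summary pattern stays the same.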
The system features a user-friendly interface for easy upload of PDFs and image-heavy documents, automatically storing them in a searchable database for quick insights. Administrators can manage uploads, monitor performance, and maintain data integrity via a dedicated admin page.
Recipe
LLM Framework: LangChain
LLM Models: GPT-4 Vision & GPT-4o
Embedding Model: text-embedding-3-large
Vector Store: Pinecone
Image Store: Google Cloud Storage
Session History Database: PostgreSQL
Backend Framework: FastAPI
Frontend: React, HTML/CSS
