CV Parser API
A REST API that accepts CVs in PDF or DOCX format, parses them with LangChain and GPT-4, and returns clean JSON containing name, contact info, work history, education, and skills.
Category: Artificial Intelligence
Year: 2023
Role: Backend Developer
Status: Completed
Problem
The CV parsing modules built into HR software are expensive and perform poorly on Turkish-language CVs.
Solution
Custom prompt templates combined with LangChain's structured output feature parse both English and Turkish CVs with 94% accuracy. Parse results are cached in Redis.
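To make the target of the structured output concrete, here is a minimal sketch of what the output schema could look like. It uses standard-library dataclasses as a stand-in for whatever schema class is actually handed to LangChain; every field name below (email, phone, work_history, and so on) is an assumption based on the JSON fields listed in the description, not the project's real schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class WorkEntry:
    company: str
    title: str
    start: Optional[str] = None  # hypothetical: dates kept as strings as they appear on the CV
    end: Optional[str] = None


@dataclass
class EducationEntry:
    institution: str
    degree: Optional[str] = None


@dataclass
class ParsedCV:
    name: str
    email: Optional[str] = None
    phone: Optional[str] = None
    work_history: List[WorkEntry] = field(default_factory=list)
    education: List[EducationEntry] = field(default_factory=list)
    skills: List[str] = field(default_factory=list)


def cv_from_dict(raw: dict) -> ParsedCV:
    """Coerce a raw dict (e.g. the model's JSON output) into typed objects.

    Raises KeyError if "name" is missing and TypeError if a nested entry
    carries unexpected keys, so malformed output fails loudly.
    """
    return ParsedCV(
        name=raw["name"],
        email=raw.get("email"),
        phone=raw.get("phone"),
        work_history=[WorkEntry(**w) for w in raw.get("work_history", [])],
        education=[EducationEntry(**e) for e in raw.get("education", [])],
        skills=[str(s) for s in raw.get("skills", [])],
    )


def cv_to_json(cv: ParsedCV) -> str:
    # ensure_ascii=False keeps Turkish characters readable in the output
    return json.dumps(asdict(cv), ensure_ascii=False)
```

Keeping optional fields as None rather than empty strings lets API consumers tell "not found on the CV" apart from "found but empty".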
Outcomes
94% parse accuracy across Turkish and English CVs
Average processing time: 2.3 seconds
2 production integrations with HR software
Technical Challenges
Robust text extraction across widely varying CV layouts
Type-safe parsing using LangChain structured output
Cost optimization with a Redis TTL caching strategy
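The caching idea can be sketched as follows: key each result by a hash of the uploaded file's bytes, so a re-uploaded CV never triggers a second GPT-4 call until the entry expires. This is an illustration under assumptions, not the project's code. The "cv:" key prefix, the 24-hour default TTL, and the helper names are hypothetical, and a small in-memory class stands in for Redis so the example runs without a server. Its get and setex methods mirror the redis-py client calls of the same name.

```python
import hashlib
import json
import time
from typing import Callable, Dict, Optional, Tuple


class InMemoryTTLCache:
    """Stand-in for a Redis client, exposing only get() and setex()."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: behave as if the key never existed
            return None
        return value

    def setex(self, key: str, ttl_seconds: int, value: str) -> None:
        self._store[key] = (value, self._clock() + ttl_seconds)


def cache_key(file_bytes: bytes) -> str:
    # Identical files hash to the same key regardless of filename
    return "cv:" + hashlib.sha256(file_bytes).hexdigest()


def parse_with_cache(
    file_bytes: bytes,
    parse_fn: Callable[[bytes], dict],
    cache: InMemoryTTLCache,
    ttl_seconds: int = 86400,  # assumed default, not taken from the project
) -> dict:
    key = cache_key(file_bytes)
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)  # cache hit: no LLM call
    result = parse_fn(file_bytes)  # cache miss: run the expensive parse
    cache.setex(key, ttl_seconds, json.dumps(result))
    return result
```

Hashing the content rather than the filename means the same CV uploaded under two names still costs one parse.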
Tech Stack
FastAPI: API framework
LangChain: LLM orchestration
GPT-4: Parse engine
PyMuPDF: PDF text extraction
python-docx: Word file reading
Redis: Caching
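As a rough sketch of how the two extraction libraries in the stack might sit behind a single entry point, the example below detects the file type from its leading bytes and dispatches to PyMuPDF or python-docx. The function names and the detection by magic bytes are assumptions for illustration; the actual service may rely on the upload's content type or file extension instead.

```python
import io


def detect_format(data: bytes) -> str:
    """Guess the file type from its leading bytes."""
    if data.startswith(b"%PDF"):
        return "pdf"
    if data.startswith(b"PK\x03\x04"):
        # DOCX is a ZIP container; other ZIP-based files match this too
        return "docx"
    raise ValueError("unsupported file type: expected PDF or DOCX")


def extract_text(data: bytes) -> str:
    """Return the plain text of a PDF or DOCX file given as bytes."""
    kind = detect_format(data)
    if kind == "pdf":
        import fitz  # PyMuPDF; imported lazily so DOCX-only use needs no PDF dependency

        with fitz.open(stream=data, filetype="pdf") as doc:
            return "\n".join(page.get_text() for page in doc)

    from docx import Document  # python-docx

    document = Document(io.BytesIO(data))
    # Note: this reads body paragraphs only; text inside tables is not included
    return "\n".join(p.text for p in document.paragraphs)
```

CVs laid out with tables or text boxes would need extra handling on the DOCX side, which is one reason extraction across layouts is listed as a challenge.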