Location: Onsite (Raleigh, North Carolina)
CTS/LexisNexis
JD
Senior Data Scientist
As a Senior Data Scientist at Cognizant, you will play a crucial role in engineering modern businesses to improve everyday life. You will be instrumental in driving new product development within a collaborative team environment, writing production code in both run-time and build-time environments. Your role will involve proposing and building data-driven solutions to address high-value customer problems. You will work with large-scale natural language datasets, including matter and contract repositories, invoice/legal spend data, and work management. Your contributions will be pivotal in prototyping new ideas and collaborating with other data scientists, product designers, data engineers, front-end developers, and expert legal data annotators. You will experience the dynamic culture of a start-up while leveraging the extensive resources of an established company.
An ideal candidate will have a strong passion for moving beyond Jupyter Notebooks and for consistently delivering production-ready code each sprint.
RESPONSIBILITIES
• Develop and implement LLM-based applications tailored for in-house legal needs, ensuring they align with Cognizant's commitment to excellence and innovation.
• Evaluate and maintain our data assets and training/evaluation datasets, ensuring they meet the highest standards of integrity and quality.
• Design and build pipelines for preprocessing, annotating, and managing legal document datasets, fostering a customer-centric mindset.
• Collaborate with legal experts to understand requirements and ensure models meet domain-specific needs, working as one team.
• Conduct experiments and evaluate model performance to drive continuous improvements, raising the bar in all deliverables.
• Evaluate AI/ML and GenAI outcomes through both human review and automated methods to ensure accuracy, reliability, and alignment with business objectives.
• Interface with other technical personnel or team members to finalize requirements, demonstrating ownership of outcomes.
• Work closely with other development team members to understand complex product requirements and translate them into software designs, ensuring ethical choices in all actions.
• Successfully implement development processes, coding best practices, and code reviews for production environments, embodying Cognizant's values in every task.
REQUIREMENTS
• Strong hands-on experience and foundations in machine learning, including dimensionality reduction, clustering, embeddings, and sequence classification algorithms.
• Experience with deep learning frameworks such as PyTorch, TensorFlow, and Hugging Face Transformers.
• Practical experience with Natural Language Processing methods, libraries, and models such as spaCy, word2vec, TensorFlow, Keras, PyTorch, Flair, and BERT.
• Practical experience with large language models, prompt engineering, fine-tuning, and benchmarking using frameworks such as LangChain and LlamaIndex.
• Strong Python background.
• Knowledge of AWS, GCP, Azure, or other cloud platforms.
• Understanding of data modeling principles and complex data models.
• Proficiency with relational and NoSQL databases as well as vector stores (e.g., Postgres, Elasticsearch/OpenSearch, ChromaDB).
• Knowledge of Scala and of distributed computing systems such as Spark or Ray is highly preferred.
• Knowledge of API development, containerization, and machine learning deployment is highly preferred.
• Experience with MLOps/AIOps is highly preferred.