Sr. Data Scientist 100
  • SUN-IT SOLUTIONS INC
4 Days Ago
60-65 per hour
NA
Raleigh-NC
11-44 Years
Required Skills: Dimensionality reduction, Clustering, Embeddings, Sequence classification, PyTorch, TensorFlow, Keras, Hugging Face Transformers, spaCy, Flair, word2vec, BERT, GPT, other Transformer models, LangChain, LlamaIndex, Prompt engineering, Fine-tuning & benchmarking, Retrieval-Augmented Generation (RAG), MLOps / AI Ops (Preferred), Model deployment and lifecycle management, ML pipeline automation, API development for model inference, Containerization, AWS, GCP, Azure, Distributed Computing & Big Data, Apache Spark, Ray, Scala, Relational, NoSQL, Vector stores, Python, Scala, Bash scripting, Data modeling and schema design, Working with complex and large-scale data structures, Feature engineering and data preprocessing pipelines
Job Description
Sr Data Scientist

Location: Onsite (Raleigh, North Carolina)

CTS/Lexis Nexis

 

 

Senior Data Scientist

As a Senior Data Scientist at Cognizant, you will play a crucial role in engineering modern businesses to improve everyday life. You will be instrumental in driving new product development within a collaborative team environment, writing production code in both run-time and build-time environments. Your role will involve proposing and building data-driven solutions to address high-value customer problems. You will work with large-scale natural language datasets, including matter and contract repositories, invoice/legal spend data, and work management. Your contributions will be pivotal in prototyping new ideas and collaborating with other data scientists, product designers, data engineers, front-end developers, and expert legal data annotators. You will experience the dynamic culture of a start-up while leveraging the extensive resources of an established company.

The ideal candidate will have a strong passion for moving beyond Jupyter Notebooks and for consistently delivering production-ready code each sprint.

 

RESPONSIBILITIES

• Develop and implement LLM-based applications tailored for in-house legal needs, ensuring they align with Cognizant's commitment to excellence and innovation.

• Evaluate and maintain our data assets and training/evaluation datasets, ensuring they meet the highest standards of integrity and quality.

• Design and build pipelines for preprocessing, annotating, and managing legal document datasets, fostering a customer-centric mindset.

• Collaborate with legal experts to understand requirements and ensure models meet domain-specific needs, working as one team.

• Conduct experiments and evaluate model performance to drive continuous improvements, raising the bar in all deliverables.

• Evaluate AI/ML and GenAI outcomes through both human and automated review to ensure accuracy, reliability, and alignment with business objectives.

• Interface with other technical personnel or team members to finalize requirements, demonstrating ownership of outcomes.

• Work closely with other development team members to understand complex product requirements and translate them into software designs, ensuring ethical choices in all actions.

• Implement development processes, coding best practices, and code reviews for production environments, embodying Cognizant's values in every task.

 

REQUIREMENTS

• Strong hands-on experience and foundations in machine learning, including dimensionality reduction, clustering, embeddings, and sequence classification algorithms.

• Experience with deep learning frameworks such as PyTorch, TensorFlow, and Hugging Face Transformers.

• Practical experience in Natural Language Processing methods and libraries such as spaCy, word2vec, TensorFlow, Keras, PyTorch, Flair, BERT.

• Practical experience with large language models, prompt engineering, fine-tuning, and benchmarking using frameworks such as LangChain and LlamaIndex.

• Strong Python background.

• Knowledge of AWS, GCP, Azure, or other cloud platforms.

• Understanding of data modeling principles and complex data models.

• Proficiency with relational and NoSQL databases as well as vector stores (e.g., Postgres, Elasticsearch/OpenSearch, ChromaDB).

• Knowledge of Scala, Spark, Ray, or other distributed computing systems is highly preferred.

• Knowledge of API development, containerization, and machine learning deployment is highly preferred.

• Experience with MLOps/AI Ops is highly preferred.
