Data Scientist - R01562490
Brillio
Posted: March 16, 2026
Quick Summary
Develop and apply statistical and machine learning techniques to analyze and interpret complex data, using Python, PySpark, and other tools.
Job Description
Data Scientist
Primary Skills:
• Statistical analysis and computing: hypothesis testing (t-test, z-test), regression (linear, logistic), probabilistic graphical models
• Forecasting: exponential smoothing, ARIMA, ARIMAX
• Classification: decision trees, SVM; distance metrics (Hamming, Euclidean, Manhattan)
• Languages and tools: Python/PySpark, R/RStudio, SAS/SPSS
• ML frameworks: TensorFlow, PyTorch, scikit-learn, CNTK, Keras, MXNet
• Data quality and monitoring: Great Expectations, Evidently AI
• MLOps tools: Kubeflow, BentoML
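For illustration, the distance metrics named above are simple enough to sketch in a few lines of plain Python (a minimal sketch, not part of the role's codebase):

```python
from math import sqrt

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    assert len(a) == len(b), "Hamming distance requires equal lengths"
    return sum(x != y for x, y in zip(a, b))

def euclidean(a, b):
    """Straight-line (L2) distance between two numeric vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """City-block (L1) distance between two numeric vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

print(hamming("karolin", "kathrin"))  # 3
print(euclidean([0, 0], [3, 4]))      # 5.0
print(manhattan([1, 2], [4, 6]))      # 7
```

In practice these come ready-made from libraries such as scikit-learn, but the definitions above are what those implementations compute.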
Specialization:
• Data Science Advanced: AI/ML Engineer
Job requirements:
• Qualifications
• Bachelor's or Master's degree in Computer Science, Data Science, or a related field, with 3-5 years' experience.
• Proven experience as a Data Engineer or similar role, with a focus on AI/ML projects.
• Strong proficiency in Python programming and experience with relevant libraries and frameworks (e.g., pandas, NumPy, scikit-learn, TensorFlow, PyTorch).
• Solid understanding of data engineering concepts, data modeling, and database systems (e.g., SQL, NoSQL).
• Experience with data integration and ETL tools (e.g., Apache Airflow, Apache Spark, Talend).
• Familiarity with cloud platforms (e.g., AWS, Azure, GCP) and related services (e.g., S3, EC2, BigQuery).
• Strong problem-solving skills and ability to work in a fast-paced, collaborative environment.
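As a small illustration of the extract-transform-load pattern referenced in the qualifications, here is a minimal sketch using only Python's standard library (the `payments` table and CSV contents are hypothetical examples, not data from this role):

```python
import csv
import io
import sqlite3

# Extract: parse CSV text (stands in for a real source file or API response)
raw = "id,amount\n1,10.5\n2,\n3,7.25\n"
rows = list(csv.DictReader(io.StringIO(raw)))

# Transform: drop rows with missing amounts and cast fields to proper types
clean = [(int(r["id"]), float(r["amount"])) for r in rows if r["amount"]]

# Load: write the cleaned rows into a SQLite table and verify with a query
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO payments VALUES (?, ?)", clean)
total = conn.execute("SELECT SUM(amount) FROM payments").fetchone()[0]
print(total)  # 17.75
```

Tools like Apache Airflow and Spark orchestrate and scale the same three steps across many sources and large data volumes.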
Job Description & Responsibilities
Data Pipeline Development:
• Design, develop, and maintain scalable and efficient data pipelines to collect, clean, and transform large volumes of data.
• Collaborate with software engineers and other stakeholders to understand data requirements and implement effective solutions.
• Ensure data pipelines are robust, reliable, and optimized for performance.
Data Modeling and Integration:
• Design and implement data models that support the storage, retrieval, and analysis of structured and unstructured data.
• Integrate and consolidate data from various sources, both internal and external, to create a unified and comprehensive data ecosystem.
• Ensure data integrity and accuracy through data quality assessments, cleansing, and validation techniques.
Machine Learning Implementation:
• Optimize and enhance machine learning algorithms for performance, scalability, and accuracy.
• Implement data preprocessing, feature engineering, and model training workflows using Python and relevant libraries (e.g., scikit-learn, TensorFlow, PyTorch).
Data Infrastructure Management:
• Configure and maintain cloud-based infrastructure for data storage, processing, and analysis.
• Monitor and troubleshoot data-related issues, ensuring high availability and reliability of data systems.
• Stay up-to-date with emerging technologies, tools, and best practices in data engineering, AI, and ML.
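To give a flavor of the model-training workflows the responsibilities describe, here is a toy logistic-regression fit by batch gradient descent in plain Python (a teaching sketch with made-up data; real work would use scikit-learn or the other listed frameworks):

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights w and bias b by averaged batch gradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    m = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error drives the gradient of the log-loss
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / m for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / m
    return w, b

def predict(w, b, xi):
    """Classify as 1 when the predicted probability reaches 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0

# Toy, linearly separable data: label is 1 roughly when x0 + x1 > 1
X = [[0.0, 0.0], [0.2, 0.3], [1.0, 1.0], [0.9, 0.8], [0.1, 0.2], [1.2, 0.9]]
y = [0, 0, 1, 1, 0, 1]
w, b = train_logistic(X, y)
preds = [predict(w, b, xi) for xi in X]
print(preds)  # matches y on this toy data
```

Production workflows wrap the same ideas (loss, gradients, iteration) in framework APIs and add the preprocessing and feature-engineering stages mentioned above.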