ML Developer — Surveillance Detection Models

Accolite

2026-04-30

Gurgaon, Haryana, India

Top Skills:

Apache Spark, Architecture, ASIC, Asset Class, Azure, BERT, Calibration, Compliance, Computational, Data Processing, Docker, Elasticsearch, Financial Compliance, Financial Service, Gap Analysis, Governance, Kubernetes, Lexicon, Machine Learning, Natural Language Processing, NLP, NLTK, NumPy, Pipeline, PostgreSQL, Python, PyTorch, Regulatory Audit, scikit-learn, Sentiment Analysis, spaCy, TensorFlow, Text Classification

Role Overview

We are seeking an ML Developer with expertise in natural language processing and financial compliance to design, implement, and benchmark deterministic and transformer-based detection models for electronic communication surveillance. This role is central to the bank's ability to detect market abuse, conduct risk, information barrier breaches, and off-channel evasion across trader communications.

ASIC INFO 283 explicitly warns against reliance on vendor default alert thresholds, requiring licensees to calibrate models to their specific risk profile. You will build the model benchmarking framework that continuously tests and measures detection model effectiveness, enabling NAB to demonstrate to regulators that its surveillance models are tuned, validated, and performing to measurable standards.
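
As a rough illustration of the calibration work described above, the sketch below picks an alert threshold from labelled scores instead of accepting a vendor default. It assumes one calibration rule among many: keep the highest threshold whose recall still meets a floor. The function names, the recall floor, and the sample scores are hypothetical and are not taken from ASIC INFO 283 or from any surveillance platform.

```python
from typing import Optional, Sequence, Tuple


def precision_recall_at(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Precision and recall when an alert fires for every score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def calibrate_threshold(
    scores: Sequence[float], labels: Sequence[int], min_recall: float
) -> Optional[float]:
    """Highest observed score usable as a threshold while recall >= min_recall.

    Returns None when no candidate threshold reaches the recall floor.
    """
    best = None
    for t in sorted(set(scores)):
        _, recall = precision_recall_at(scores, labels, t)
        if recall >= min_recall:
            best = t  # candidates ascend, so the last qualifying one is the highest
    return best


# Hypothetical labelled scores for a single desk (1 = confirmed misconduct).
desk_scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
desk_labels = [0, 0, 1, 1, 0, 1]
desk_threshold = calibrate_threshold(desk_scores, desk_labels, min_recall=1.0)
```

Running the same routine separately per desk, asset class, or jurisdiction yields a threshold for each segment. A precision floor or a cost-weighted objective could replace the recall floor without changing the structure.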

Required Qualifications

5+ years in machine learning engineering, with at least 3 years in NLP/text classification in financial services or compliance
Strong proficiency in Python (scikit-learn, PyTorch, TensorFlow, Hugging Face Transformers)
Hands-on experience fine-tuning pre-trained language models (BERT, RoBERTa, GPT-family) for domain-specific tasks
Experience building model evaluation and benchmarking pipelines with automated metric tracking and drift detection
Understanding of financial services compliance: market abuse, insider trading, front-running, conduct risk
Experience with threshold tuning, FP/FN trade-off analysis, and precision-recall optimisation
Familiarity with model explainability frameworks (SHAP, LIME) and model governance requirements
Bachelor’s or Master’s degree in Computer Science, Machine Learning, Statistics, or Computational Linguistics

Preferred Qualifications

Experience with surveillance platforms (NICE Actimize, Behavox, Shield FC) and their detection model architectures
Knowledge of ASIC INFO 283 model calibration requirements
Experience with voice analytics: speech-to-text, speaker diarisation, tonality/sentiment analysis
Familiarity with active learning and human-in-the-loop ML workflows
PhD in NLP, Computational Linguistics, or Machine Learning

Technical Skills & Tools

ML frameworks: PyTorch, TensorFlow, scikit-learn, Hugging Face Transformers, spaCy, NLTK
NLP: BERT, RoBERTa, FinBERT, GPT-family, Word2Vec, FastText, sentence-transformers
Data processing: pandas, NumPy, Apache Spark, Dask, polars
MLOps: MLflow, Kubeflow, Azure ML, Weights & Biases, DVC
Model evaluation: SHAP, LIME, AUC-ROC, precision-recall curves, PSI, KL divergence
Infrastructure: Docker, Kubernetes, GPU compute (NVIDIA A100/H100), Azure ML Compute
Databases: PostgreSQL, Elasticsearch, vector databases (Pinecone, Weaviate, pgvector)
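
For candidates less familiar with the two drift metrics named under model evaluation, here is a minimal sketch of PSI and KL divergence over pre-binned score distributions. It assumes the inputs are already bin proportions over identical bins. The smoothing constant and the function names are illustrative choices, not part of any listed tool.

```python
import math
from typing import List, Sequence


def smooth(proportions: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Floor each bin at eps and renormalise so empty bins do not produce log(0)."""
    clipped = [max(p, eps) for p in proportions]
    total = sum(clipped)
    return [p / total for p in clipped]


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-6) -> float:
    """KL(p || q) in nats for two distributions over the same bins."""
    ps, qs = smooth(p, eps), smooth(q, eps)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(ps, qs))


def psi(expected: Sequence[float], actual: Sequence[float], eps: float = 1e-6) -> float:
    """Population Stability Index: sum over bins of (actual - expected) * ln(actual / expected)."""
    es, acts = smooth(expected, eps), smooth(actual, eps)
    return sum((a - e) * math.log(a / e) for e, a in zip(es, acts))


# Hypothetical score-bin proportions: training baseline vs. the latest week.
baseline = [0.70, 0.20, 0.10]
latest = [0.50, 0.30, 0.20]
drift = psi(baseline, latest)
```

PSI is the symmetrised form of KL divergence, equal to KL(actual || expected) + KL(expected || actual), so the two metrics usually move together. The level at which a drift alert fires is a policy decision to agree with model governance.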

Key Responsibilities

Design and implement deterministic detection models for eComms surveillance: market abuse language, insider information patterns, tipping-off phraseology, information barrier breaches, conduct risk, off-channel evasion, and trade-comms correlation
Develop and fine-tune transformer-based NLP models (BERT, RoBERTa, FinBERT) for context-aware detection beyond simple lexicon matching
Build and maintain a model benchmarking framework with ground truth datasets, precision/recall/F1 measurement, AUC-ROC analysis, and automated weekly benchmark runs
Implement threshold tuning workflows to calibrate alert sensitivity per desk, asset class, and jurisdiction — ensuring compliance with ASIC INFO 283 guidance
Design false positive reduction strategies: contextual filtering, trader baseline profiling, alert clustering, and analyst feedback loops
Develop false negative detection: red team simulated misconduct, historical replay testing, coverage gap analysis, and cross-model ensemble voting
Prepare communication data for transformer models: tokenisation, sequence formatting with context windowing, label engineering from investigations, and data augmentation
Implement model drift detection using PSI and KL divergence with automated alerts when distributions shift
Build champion-challenger evaluation: shadow-mode deployment with statistical significance testing before production promotion
Deliver model explainability using SHAP/LIME for regulatory audit readiness
Produce monthly model effectiveness scorecards for compliance committee review
Collaborate with the eComms pipeline team to ensure clean, normalised inputs for ML model training and inference
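
To give a sense of the champion-challenger responsibility above, the following sketch applies an exact McNemar test to paired shadow-mode results. It assumes both models scored the same labelled messages and that each result is reduced to a correct/incorrect flag. McNemar is one reasonable choice of significance test here, not one the role prescribes. The function names and the 0.05 significance level are hypothetical.

```python
from math import comb
from typing import Sequence


def mcnemar_exact(champion_ok: Sequence[bool], challenger_ok: Sequence[bool]) -> float:
    """Two-sided exact McNemar p-value from paired per-message correctness flags."""
    only_champion = sum(1 for c, d in zip(champion_ok, challenger_ok) if c and not d)
    only_challenger = sum(1 for c, d in zip(champion_ok, challenger_ok) if d and not c)
    n = only_champion + only_challenger  # discordant pairs carry all the evidence
    if n == 0:
        return 1.0
    k = min(only_champion, only_challenger)
    # Binomial(n, 0.5) lower tail, doubled for a two-sided test and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def should_promote(
    champion_ok: Sequence[bool], challenger_ok: Sequence[bool], alpha: float = 0.05
) -> bool:
    """Promote only if the challenger wins more discordant pairs and p < alpha."""
    challenger_wins = sum(1 for c, d in zip(champion_ok, challenger_ok) if d and not c)
    champion_wins = sum(1 for c, d in zip(champion_ok, challenger_ok) if c and not d)
    p_value = mcnemar_exact(champion_ok, challenger_ok)
    return challenger_wins > champion_wins and p_value < alpha


# Hypothetical shadow-mode results on 30 labelled messages.
champion = [True] * 1 + [False] * 9 + [True] * 20
challenger = [False] * 1 + [True] * 9 + [True] * 20
promote = should_promote(champion, challenger)
```

A paired test suits shadow mode because both models see identical traffic, so only the messages on which they disagree are informative.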

3-5 years

Not Disclosed

Full time
