ML Developer — Surveillance Detection Models
Accolite
Gurgaon, Haryana, India
3-5 years
Not Disclosed
Full time
30 April 2026
Top Skills:
Apache Spark, Architecture, ASIC, Asset Class, Azure, BERT, Calibration, Compliance, Computational, Data Processing, Docker, Elasticsearch, Financial Compliance, Financial Service, Gap Analysis, Governance, Kubernetes, Lexicon, Machine Learning, Natural Language Processing, NLP, NLTK, NumPy, Pipeline, PostgreSQL, Python, PyTorch, Regulatory Audit, scikit-learn, Sentiment Analysis, spaCy, TensorFlow, Text Classification

Role Overview

We are seeking an ML Developer with expertise in natural language processing and financial compliance to design, implement, and benchmark deterministic and transformer-based detection models for electronic communication surveillance. This role is central to the bank's ability to detect market abuse, conduct risk, information barrier breaches, and off-channel evasion across trader communications.

ASIC INFO 283 explicitly warns against reliance on vendor default alert thresholds, requiring licensees to calibrate models to their specific risk profile. You will build the model benchmarking framework that continuously tests and measures detection model effectiveness, enabling NAB to demonstrate to regulators that its surveillance models are tuned, validated, and performing to measurable standards.

Required Qualifications

  • 5+ years in machine learning engineering, with at least 3 years in NLP/text classification in financial services or compliance
  • Strong proficiency in Python (scikit-learn, PyTorch, TensorFlow, Hugging Face Transformers)
  • Hands-on experience fine-tuning pre-trained language models (BERT, RoBERTa, GPT-family) for domain-specific tasks
  • Experience building model evaluation and benchmarking pipelines with automated metric tracking and drift detection
  • Understanding of financial services compliance: market abuse, insider trading, front-running, conduct risk
  • Experience with threshold tuning, FP/FN trade-off analysis, and precision-recall optimisation
  • Familiarity with model explainability frameworks (SHAP, LIME) and model governance requirements
  • Bachelor’s or Master’s degree in Computer Science, Machine Learning, Statistics, or Computational Linguistics

Preferred Qualifications

  • Experience with surveillance platforms (NICE Actimize, Behavox, Shield FC) and their detection model architectures
  • Knowledge of ASIC INFO 283 model calibration requirements
  • Experience with voice analytics: speech-to-text, speaker diarisation, tonality/sentiment analysis
  • Familiarity with active learning and human-in-the-loop ML workflows
  • PhD in NLP, Computational Linguistics, or Machine Learning

Technical Skills & Tools

  • ML frameworks: PyTorch, TensorFlow, scikit-learn, Hugging Face Transformers, spaCy, NLTK
  • NLP: BERT, RoBERTa, FinBERT, GPT-family, Word2Vec, FastText, sentence-transformers
  • Data processing: pandas, NumPy, Apache Spark, Dask, polars
  • MLOps: MLflow, Kubeflow, Azure ML, Weights & Biases, DVC
  • Model evaluation: SHAP, LIME, AUC-ROC, precision-recall curves, PSI, KL divergence
  • Infrastructure: Docker, Kubernetes, GPU compute (NVIDIA A100/H100), Azure ML Compute
  • Databases: PostgreSQL, Elasticsearch, vector databases (Pinecone, Weaviate, pgvector)
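
For candidates less familiar with the drift metrics named under model evaluation, the following is a minimal sketch of PSI and KL divergence computed over binned model scores. It uses only the Python standard library. The function names, the bin edges, the sample scores, and the 0.25 alert level are illustrative assumptions, not part of the role's actual tooling. The 0.25 figure is a commonly quoted rule of thumb, not a regulatory requirement.

```python
import math
from typing import List, Sequence


def bin_proportions(scores: Sequence[float], edges: Sequence[float],
                    eps: float = 1e-6) -> List[float]:
    """Share of scores falling in each bin; values outside the edges are
    clamped into the first or last bin. Proportions are floored at eps so
    that empty bins do not produce log(0) or division by zero."""
    counts = [0] * (len(edges) - 1)
    for s in scores:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if s < edges[i + 1] or is_last:
                counts[i] += 1
                break
    total = len(scores)
    return [max(c / total, eps) for c in counts]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def psi(expected: Sequence[float], actual: Sequence[float]) -> float:
    """Population Stability Index: sum of (actual - expected) * ln(actual / expected)."""
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


# Hypothetical example: model scores at validation time vs. scores this week.
EDGES = [0.0, 0.25, 0.5, 0.75, 1.0]
baseline_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
current_scores = [0.6, 0.7, 0.8, 0.9, 0.9, 0.95, 0.3, 0.1]

baseline = bin_proportions(baseline_scores, EDGES)
current = bin_proportions(current_scores, EDGES)

drift = psi(baseline, current)
PSI_ALERT_LEVEL = 0.25  # assumed alert level for illustration only
drift_alert = drift > PSI_ALERT_LEVEL
print(f"PSI={drift:.4f}, KL(current||baseline)={kl_divergence(current, baseline):.4f}, alert={drift_alert}")
```

When both sets of proportions sum to one, PSI equals the sum of the two directed KL divergences, which is why the two metrics are often tracked together.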

Key Responsibilities

  • Design and implement deterministic detection models for eComms surveillance: market abuse language, insider information patterns, tipping-off phraseology, information barrier breaches, conduct risk, off-channel evasion, and trade-comms correlation
  • Develop and fine-tune transformer-based NLP models (BERT, RoBERTa, FinBERT) for context-aware detection beyond simple lexicon matching
  • Build and maintain a model benchmarking framework with ground truth datasets, precision/recall/F1 measurement, AUC-ROC analysis, and automated weekly benchmark runs
  • Implement threshold tuning workflows to calibrate alert sensitivity per desk, asset class, and jurisdiction — ensuring compliance with ASIC INFO 283 guidance
  • Design false positive reduction strategies: contextual filtering, trader baseline profiling, alert clustering, and analyst feedback loops
  • Develop false negative detection: red team simulated misconduct, historical replay testing, coverage gap analysis, and cross-model ensemble voting
  • Prepare communication data for transformer models: tokenisation, sequence formatting with context windowing, label engineering from investigations, and data augmentation
  • Implement model drift detection using PSI and KL divergence with automated alerts when distributions shift
  • Build champion-challenger evaluation: shadow-mode deployment with statistical significance testing before production promotion
  • Deliver model explainability using SHAP/LIME for regulatory audit readiness
  • Produce monthly model effectiveness scorecards for compliance committee review
  • Collaborate with the eComms pipeline team to ensure clean, normalised inputs for ML model training and inference
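
To illustrate the per-desk threshold tuning described in the responsibilities above, here is a minimal sketch that picks, for each desk, the highest alert threshold that still meets a target recall on labelled ground truth, then reports the resulting precision. The desk names, scores, labels, the 0.6 recall target, and the function names are all hypothetical. A real calibration would use far larger ground truth sets and would follow the bank's own model governance process.

```python
from typing import Dict, List, Sequence, Tuple


def precision_recall_at(scores: Sequence[float], labels: Sequence[int],
                        threshold: float) -> Tuple[float, float]:
    """Precision and recall when every score >= threshold raises an alert."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def calibrate_threshold(scores: Sequence[float], labels: Sequence[int],
                        min_recall: float) -> Tuple[float, float, float]:
    """Return (threshold, precision, recall) for the highest threshold whose
    recall meets min_recall. Recall never increases as the threshold rises,
    so the first hit when scanning from high to low is the highest one."""
    for t in sorted(set(scores), reverse=True):
        precision, recall = precision_recall_at(scores, labels, t)
        if recall >= min_recall:
            return t, precision, recall
    raise ValueError("no threshold reaches the requested recall")


# Hypothetical ground truth per desk: (model scores, 1 = confirmed misconduct).
ground_truth: Dict[str, Tuple[List[float], List[int]]] = {
    "rates": ([0.95, 0.80, 0.70, 0.40, 0.30, 0.10], [1, 1, 0, 1, 0, 0]),
    "fx": ([0.90, 0.60, 0.55, 0.50, 0.20], [1, 0, 1, 0, 0]),
}

MIN_RECALL = 0.6  # assumed target for illustration only
calibrated = {
    desk: calibrate_threshold(scores, labels, MIN_RECALL)
    for desk, (scores, labels) in ground_truth.items()
}
for desk, (threshold, precision, recall) in calibrated.items():
    print(f"{desk}: threshold={threshold:.2f} precision={precision:.2f} recall={recall:.2f}")
```

In this toy data the two desks end up with different thresholds for the same recall target, which is the practical reason a single vendor default threshold is unlikely to fit every desk.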