Data Science & Machine Learning

This 3-day workshop is a unique opportunity to learn how financial data can be combined with machine learning to generate trading strategies across all asset classes. Delegates will gain insight into data science techniques such as data selection, analysis and cleansing, before discovering how the prepared data can be fed into the most effective machine learning methods. Backtesting and execution strategies are also covered, providing a complete toolbox for developing trading strategies driven by data and machine learning.

The course makes extensive use of Python packages such as Pandas, Scikit-learn and LightGBM.

09 - 11 December 2019


3 Days


London, UK – Tower Hotel, London E1


Ernest Chan

Course Fee:

£2590 +VAT


Challenges of financial data science and machine learning

  • Data cleansing: Why even simple daily data cannot be trusted
  • Feature engineering: Claims that this step is easy for deep learning are false
  • Feature selection: What even experts can get wrong here
  • Machine learning: shallow + deep learning work best together
  • Avoiding data snooping and selection bias: using combinatorial purged cross-validation (CPCV)
  • Metalabelling: improving your proprietary strategy without telling anyone
  • Backtesting: beyond machine learning
  • Automated execution: choosing a platform

Data cleansing and feature engineering

  • Checking and adjusting price and volume data in stocks and futures
  • Survivorship bias and how to find it
  • Stationarity and “fractional differentiation”
  • Sanity checks for news sentiment data
  • Sanity checks for earnings data
  • What a security master is, and how to create one where none exists
  • Aggregating and encoding categorical data into features
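
As an illustration of the "fractional differentiation" item above, here is a minimal sketch of a fixed-window fractional difference. It is not taken from the course material: the function names `frac_diff_weights` and `frac_diff`, and the default window of 20 weights, are assumptions chosen for this example. The weights come from the binomial expansion of (1 - B)^d, where B is the lag operator and d is the (possibly non-integer) order of differencing.

```python
import numpy as np

def frac_diff_weights(d, n_weights):
    """Weights of the binomial expansion of (1 - B)^d, most recent lag first."""
    w = [1.0]
    for k in range(1, n_weights):
        # Recursion: w_k = -w_{k-1} * (d - k + 1) / k
        w.append(-w[-1] * (d - k + 1) / k)
    return np.array(w)

def frac_diff(series, d, n_weights=20):
    """Fixed-window fractional difference of a 1-D series.

    The first n_weights - 1 outputs are NaN because the window is incomplete.
    """
    x = np.asarray(series, dtype=float)
    w = frac_diff_weights(d, n_weights)
    out = np.full(len(x), np.nan)
    for t in range(n_weights - 1, len(x)):
        window = x[t - n_weights + 1 : t + 1][::-1]  # most recent value first
        out[t] = np.dot(w, window)
    return out

# With d = 1 the weights reduce to an ordinary first difference.
first_diff = frac_diff([1.0, 2.0, 4.0, 7.0], d=1, n_weights=2)
half_weights = frac_diff_weights(0.5, 3)
```

With d = 1 this reproduces the usual first difference; values of d between 0 and 1 keep some memory of the price level while moving the series closer to stationarity.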

Machine learning

  • Simple features and shallow ML using logistic regression with L1 and L2 regularization
  • Deeper learning: Random forests and gradient boosted trees with Scikit-learn and LightGBM
  • Feature selection using Mean Decrease Accuracy and SHAP: be careful where you apply that!
  • Cross-validation and hyperparameter optimization
  • Metrics for measuring machine learning outcomes
  • Metalabelling: what common base models to use?
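
To make the Mean Decrease Accuracy item above concrete, the sketch below uses Scikit-learn's `permutation_importance`, which is one common way to compute this kind of score; it is not necessarily the procedure taught in the course. The synthetic data, in which only the first column drives the label, is an assumption made purely for illustration. Importance is measured on held-out data, by shuffling one feature at a time and recording the drop in accuracy.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic data (illustrative assumption): only column 0 determines the label.
X = rng.normal(size=(600, 3))
y = (X[:, 0] > 0).astype(int)

# shuffle=False keeps the original ordering, as one would for time series.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, shuffle=False
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Shuffle each feature in turn on the held-out set and record the accuracy drop.
result = permutation_importance(
    model, X_test, y_test, scoring="accuracy", n_repeats=10, random_state=0
)
importances = result.importances_mean
```

In this toy setting the informative column should show a large accuracy drop and the two noise columns a drop close to zero.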


  • Machine learning suggests, but does not determine, trading strategy
  • Various ways of using the output of ML for trading
  • Reducing data snooping bias using CPCV
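
For readers unfamiliar with CPCV (combinatorial purged cross-validation), the sketch below shows only its combinatorial part: the sample is cut into contiguous groups, and every choice of test groups yields one train/test split. The helper name `cpcv_splits` and the sizes used are assumptions for this example, and the purging and embargoing of training samples near each test group, which a real implementation needs, are deliberately left out.

```python
from itertools import combinations
from math import comb

import numpy as np

def cpcv_splits(n_samples, n_groups, n_test_groups):
    """Yield (train_idx, test_idx) for every combination of test groups.

    Purging and embargoing are omitted from this sketch.
    """
    groups = np.array_split(np.arange(n_samples), n_groups)
    for test_ids in combinations(range(n_groups), n_test_groups):
        test_idx = np.concatenate([groups[g] for g in test_ids])
        train_idx = np.concatenate(
            [groups[g] for g in range(n_groups) if g not in test_ids]
        )
        yield train_idx, test_idx

# 6 groups with 2 test groups per split gives C(6, 2) = 15 splits.
splits = list(cpcv_splits(n_samples=60, n_groups=6, n_test_groups=2))

# Each group appears in the test set of C(5, 1) = 5 splits, so the test
# results can be stitched into 5 complete backtest paths.
n_paths = comb(6, 2) * 2 // 6
```

Because each observation is tested several times under different training sets, the procedure produces a distribution of backtest results instead of a single path.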