Data Science in Action
Applications of Data Science in Drug
Discovery, Financial Services, Project
Management, Human Resources and
Marketing
What this talk is about
My Journey
- Change of career
- Lessons learned
Industries covered
- Pharmaceutical
- Project management
- Financial services within a major UK bank
- Human resources and recruitment
- Marketing
Druuuuugssssss *the boring kind
PubMed: a searchable database of medical and
biological literature
Every day hundreds of papers are published.
How can an expert in ontology know about
advances in diabetes research that will impact
his/her research?
Expedite the arrival at a Eureka! moment
Benevolent AI
Headquartered in London, UK
Startup
Partnered with AstraZeneca
AI to the rescue!
- GPUs
- TPUs
- Lots of money
Combining free unstructured text with hand-curated
chemical-drug, protein-gene and gene-drug
databases to gain insight and help the drug
discovery scientist reach their Eureka! moment.
NLP
- POS tagging
- Syntactic parsing
- Entity detection
Graph Theory
- Inferred edges
- Path analysis
Reinforcement Learning
- Software engineers
- Data Scientists
- Bioinformaticians
- Drug discovery scientists
- Clever business-type people
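To make the "Graph Theory - Path analysis" item on this slide concrete, here is a minimal sketch. It assumes a tiny hand-made graph whose node names (`drug_A`, `protein_X`, `gene_G`) are hypothetical placeholders, not real biology, and it uses a plain breadth-first search as a stand-in; the talk does not say which algorithm was actually used.

```python
from collections import deque

# Hypothetical toy knowledge graph: node -> list of (relation, neighbour).
# All names are placeholders for illustration only.
GRAPH = {
    "drug_A": [("activates", "protein_X")],
    "protein_X": [("activates", "gene_G")],
    "drug_B": [("inhibits", "protein_Y")],
    "protein_Y": [],
    "gene_G": [],
}

def find_path(graph, start, goal):
    """Breadth-first search for the shortest chain of relations.

    Returns a list of (source, relation, target) hops, or None if the
    goal cannot be reached from the start node.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [(node, relation, neighbour)]))
    return None

path = find_path(GRAPH, "drug_A", "gene_G")
```

Each hop in the returned path is one edge a scientist can inspect, which is roughly what makes a path-based suggestion explainable.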
Show me the money!
Onset of open banking in the UK
Major banks want to get ahead of the curve by
extracting maximal insight from their data
Millions of transaction records
Transaction Classification -- what is a salary
payment? Are we losing money to competitors
when it comes to savings?
Pensions: how to identify trends in pensions and
attract/retain customers
Mudano
Consultancy
Startup
Visionaries in implementing AI in the project
management domain.
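The transaction classification question on this slide ("what is a salary payment?") can be approached with a simple baseline before any model is trained. The sketch below is an assumption, not the bank's method: it flags a counterparty as a likely salary payer when credits arrive in at least three distinct months with similar amounts. The data, the function name `likely_salary_payers` and both thresholds are hypothetical.

```python
from collections import defaultdict

# Hypothetical transactions: (month, counterparty, amount); positive = credit.
TRANSACTIONS = [
    ("2019-01", "ACME LTD", 2500.00),
    ("2019-02", "ACME LTD", 2500.00),
    ("2019-03", "ACME LTD", 2550.00),
    ("2019-02", "J SMITH", 40.00),
    ("2019-03", "SUPERMARKET", -62.10),
]

def likely_salary_payers(transactions, min_months=3, tolerance=0.10):
    """Return counterparties whose credits recur in at least `min_months`
    distinct months, with every monthly total within `tolerance`
    (as a fraction) of that counterparty's mean monthly total."""
    credits = defaultdict(dict)
    for month, payer, amount in transactions:
        if amount > 0:  # ignore debits
            credits[payer][month] = credits[payer].get(month, 0.0) + amount
    result = []
    for payer, by_month in credits.items():
        amounts = list(by_month.values())
        if len(amounts) < min_months:
            continue
        mean = sum(amounts) / len(amounts)
        if all(abs(a - mean) <= tolerance * mean for a in amounts):
            result.append(payer)
    return sorted(result)

salary_payers = likely_salary_payers(TRANSACTIONS)
```

A rule like this also gives a cheap baseline to compare any later model against.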
Give me time! Give me clarity!
Think Kanban, Atlassian JIRA
Think like a project manager -- how do I keep
track of all the tickets, across all projects and all
my human resources?
How do I streamline a project to cut waste and
maximise production?
How do I identify pain points in a project, i.e.
issues that might delay delivery?
I want to hire the best!
Think like a recruiter.
How do I identify the optimum person for a job?
- Specific experience
- Specific qualification
- Likely to be ready to move
- Go beyond keyword matches
Often the client knows the person they want to
hire -- and wants someone similar!
Pre-seed startup
Emerged from a startup incubator in London
(Founders Factory)
What to look for?
- Compare the companies a candidate worked for
- Large companies are distinctly different to smaller ones
- Similarity matrix -- dimensionality reduction
- Look for candidates who have similar job titles
- Semantic search
- From job descriptions
- Propensity to move: who is likely to be open to new job opportunities?
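The "look for candidates who have similar job titles" idea above can be sketched as a similarity ranking. This is a deliberately simple stand-in: cosine similarity over bag-of-words counts, which is still close to keyword matching. A real semantic search would replace the word counts with learned embeddings; the candidate names and titles below are hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two strings."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def rank_by_title(reference_title, candidates):
    """Sort (name, title) pairs by similarity to the reference title, best first."""
    scored = [(cosine_similarity(reference_title, title), name)
              for name, title in candidates]
    return [name for score, name in sorted(scored, key=lambda s: -s[0])]

# Hypothetical candidates for illustration only.
CANDIDATES = [
    ("cand_1", "Senior Data Scientist"),
    ("cand_2", "Head of Marketing"),
    ("cand_3", "Data Engineer"),
]

ranking = rank_by_title("Lead Data Scientist", CANDIDATES)
```

The same ranking function works unchanged if the strings are job descriptions rather than titles.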
Who should I advertise to?
- Identifying target audiences online
- Demographics
- Online behaviour
- Location data
Want to know:
- Who is likely to visit a store
- Who is likely to click on a link
- Keeping things in line with GDPR
Part of the Omnicom Group
Operates like a startup (flat management, fluid
job descriptions, room to innovate)
With the backing of a large organisation
The common theme? Recommender Systems
Recommend --
- The protein which activates a gene
- A drug that activates the protein
- The ideal candidate for a job
- The most relevant notifications for a project
- The right audience for an advertisement
Recommender systems: deconstructed
Getting the data--
- Buy it
- Scrape it
- Mine it (via apps, cookies, etc.)
- Download it
Identifying data quality: startup pitfall
- Many startups jump straight into the DS model
- Don't allow time for data quality checks
- Or understanding the data
- Decide on the list of desirables in advance
- Check for missing variables
- Correlations
- Check the volume of data when joined to other data sets
- Ask: can we impute the data?
- What can we do with missing/incomplete/inaccurate data?
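Two of the checks listed above, missing variables and the volume of data that survives a join, are cheap to run before any modelling. The sketch below assumes records held as plain dictionaries; the field names (`customer_id`, `age`) and both helper functions are hypothetical.

```python
def missing_rate(rows, fields):
    """Fraction of rows in which each field is absent, None or an empty string."""
    total = len(rows)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for row in rows if row.get(field) in (None, "")) / total
        for field in fields
    }

def join_retention(left_rows, right_rows, key):
    """Fraction of left rows whose key also appears in the right data set."""
    if not left_rows:
        return 0.0
    right_keys = {row[key] for row in right_rows if key in row}
    kept = sum(1 for row in left_rows if row.get(key) in right_keys)
    return kept / len(left_rows)

# Hypothetical data for illustration only.
CUSTOMERS = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": None},
    {"customer_id": 3, "age": 51},
    {"customer_id": 4, "age": 28},
]
ACCOUNTS = [{"customer_id": 1}, {"customer_id": 3}, {"customer_id": 4}]

rates = missing_rate(CUSTOMERS, ["customer_id", "age"])
retention = join_retention(CUSTOMERS, ACCOUNTS, "customer_id")
```

A low retention figure is an early warning that the joined data set may be too small to model on, whatever the size of the inputs.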
The Data Science Model: circle of life
- Don't be clever
- Start simple
- Iterate
- Play with the data! Feature selection, feature engineering
- Understand the data -- gain a little domain knowledge
- Measured by whether you can hold a conversation with a domain expert
- Understand what is required!
- Determine what the desired outcome is
- Good precision or recall
- AUC
- If unsupervised, how do we determine quality?
- Specific gain for the business? (More revenue? More efficient work? New discoveries?)
- Precision might come at the expense of discovery!
- Recall might be at the expense of efficiency
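The precision/recall trade-off in the last two bullets can be shown with a few lines of arithmetic. Precision is TP / (TP + FP) and recall is TP / (TP + FN). The labels and the two example models below ("cautious" and "eager") are made up for illustration.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical ground truth and two hypothetical models.
Y_TRUE = [1, 1, 1, 0, 0, 0]
CAUTIOUS = [1, 0, 0, 0, 0, 0]  # predicts positive only when very sure
EAGER = [1, 1, 1, 1, 1, 0]     # predicts positive liberally

cautious_scores = precision_recall(Y_TRUE, CAUTIOUS)
eager_scores = precision_recall(Y_TRUE, EAGER)
```

The cautious model never raises a false alarm but misses two of the three real positives (discovery suffers); the eager model finds all three but two of its five alerts are wrong (efficiency suffers).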
Supervised vs unsupervised
- Do you have training data?
- Is that training data reliable?
- What is the source?
- Mechanical Turk?
- Expert annotation?
- Is it biased?
- Is my training data copious?
- Can I combine golden corpora with silver?
ML vs DL
- Do you need transparency or explainability?
- E.g. legal or financial services
- How much data do you have?
- Does it support DL?
- Do you have the technology?
- Time?
- Money?
- DL models on GPUs are expensive
Presenting results
Conveying the:
- Significance
- Importance
- Limitations
Of a project to stakeholders/clients/management
- People who are non-experts in the field
Presenting the results: pearls of wisdom
- Don’t lie
- Don’t exaggerate
- Be clear
- Be honest
- Try to think like a stakeholder
- What do they want from the project?
- How do I present the importance and usefulness of the results?
- Explain the benefit of the complicated/time-consuming/expensive DS approach compared with the
easier/cheaper/faster methods
