Knowledge Mining with Azure Search Technical Deck

Understanding the latent value in all content

Text
(1) Validate enrichment pipeline

Tags: “throwing”, “ball”, “girl”, “grass”, “basketball”
Caption: “A girl throwing a ball”

Entities
Persons: “Anita Christiansen”, “Conrad Nuber”
Locations: “Bothell”, “Woodinville”
Organization: “Litware Insurance Corp.”

Computer Vision
Face
Emotion
Content Moderator
Video Indexer
Custom Vision Service
Custom Decision
QnA Maker
Language Understanding (LUIS)
Text Analytics
Bing Spell Check
Translator Text
Speaker Recognition
Bing Speech
Custom Speech
Translator Speech
Unified Speech Service
Bing Autosuggest
Bing Search
Bing Entity Search
Bing Statistics add-in
Bing Visual Search
Bing Custom Search

Management Free
Keyword search
Faceting
Geospatial support
Multi-Language Support
Suggestions/auto-complete
Customizable scoring models
Proximity Search
Synonyms
etc.

INGEST: Customer data in any format (.pdf, .doc, .jpeg, …), from any Azure store
ENRICH: Cognitive skills add annotations, producing annotated documents
Built-in Cognitive Skills: OCR, Key Phrase Extraction, People Names, Company Names, Sentiment Analyzer, Computer Vision, etc.
Third Party Enrichers: Custom classification models, Custom entity extraction, etc.
Azure Machine Learning
EXPLORE: Search over the Search Index


Key Phrase Extraction
Sentiment Analysis
Organization Entity Extraction
Location Entity Extraction
Persons Entity Extraction
Language Detection
Face Detection
Tag Extraction
Celebrity Recognition
Landmark Detection
Handwriting Recognition (Preview)
Printed Text Recognition

…,
{
  "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
  "uri": "https://myskill.azurewebsites.net/api/OrgId",
  "httpHeaders": { "Api-Key": "mySecret" },
  "context": "/document/organizations/*",
  "inputs":
  [
    { "name": "organizationName", "source": "/document/organizations/*" }
  ],
  "outputs":
  [
    { "name": "organizationId", "targetName": "organizationId" }
  ]
},
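
Because the skill's context is "/document/organizations/*", it runs once per organization found in the document. The sketch below is a simplified model of that fan-out, not the service's actual batching logic: it builds a request body with one record per context item. The function name, the recordId scheme, and the second organization are illustrative assumptions.

```python
import json


def build_skill_request(document, collection_key, input_name):
    """Build a custom-skill request with one record per item in a collection.

    Simplified model: each item under /document/<collection_key>/* becomes
    one record whose single input is named `input_name`.
    """
    values = []
    for index, item in enumerate(document.get(collection_key, [])):
        values.append({
            # Assumption: any id that is unique within the batch will do.
            "recordId": str(index),
            "data": {input_name: item},
        })
    return {"values": values}


# Hypothetical enriched document; "Contoso" is made up for illustration.
document = {"organizations": ["Litware Insurance Corp.", "Contoso"]}
request_body = build_skill_request(document, "organizations", "organizationName")
print(json.dumps(request_body, indent=2))
```

Each record then receives its own organizationId output, which is attached under the matching organization node.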

Custom skill request:
{
  "values": [
    {
      "data":
      {
        "myInput1": "fox",
        "myInput2": "cat"
      }
    },
    {
      "data":
      {
        "myInput1": "blue",
        "myInput2": "red"
      }
    },
    …
  ]
}
Custom skill response:
{
  "values": [
    {
      "recordId": "7cad2",
      "data":
      {
        "myOutput1": "animals"
      }
    },
    {
      "data":
      {
        "myOutput1": "colors"
      }
    },
    …
  ]
}
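
The request and response shapes above can be exercised with a small handler. This is a minimal sketch under stated assumptions: the category table and the recordId values "r1" and "r2" are hypothetical, and the function is plain Python rather than a deployed web API. It echoes each recordId back so the caller can match results to inputs, and reports problems per record in a warnings list.

```python
# Hypothetical lookup table, chosen to mirror the slide's sample values.
CATEGORIES = {"fox": "animals", "cat": "animals", "blue": "colors", "red": "colors"}


def handle_skill_request(body):
    """Map each input record to one output record with the same recordId."""
    results = []
    for record in body.get("values", []):
        result = {
            "recordId": record.get("recordId"),
            "data": {},
            "errors": [],
            "warnings": [],
        }
        inputs = record.get("data", {})
        # Collect the distinct categories of all recognized string inputs.
        found = {
            CATEGORIES[v]
            for v in inputs.values()
            if isinstance(v, str) and v in CATEGORIES
        }
        if len(found) == 1:
            result["data"]["myOutput1"] = found.pop()
        else:
            result["warnings"].append(
                {"message": "Could not determine a single category."}
            )
        results.append(result)
    return {"values": results}


request_body = {"values": [
    {"recordId": "r1", "data": {"myInput1": "fox", "myInput2": "cat"}},
    {"recordId": "r2", "data": {"myInput1": "blue", "myInput2": "red"}},
]}
response_body = handle_skill_request(request_body)
```

Handling failures per record rather than failing the whole batch keeps one bad input from discarding the other results.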

[Diagram labels: content, keyPhrases, organizations, docClass; content, normalized images, language, tags, orgs]

"skills": [
{
"@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
"inputs":
[
{ "name": "text", "source": "/document/content" }
],
"outputs":
[
{ "name": "languageCode", "targetName": "myLanguageCode" },
{ "name": "languageName", "targetName": "myLanguageName" }
]
},

…,
{
"@odata.type": "#Microsoft.Skills.Text.NamedEntityRecognitionSkill",
"categories": [ "Organization" ],
"defaultLanguageCode": "en",
"inputs":
[
{ "name": "text", "source": "/document/content" },
{ "name": "languageCode", "source": "/document/myLanguageCode" }
],
"outputs":
[
{ "name": "organizations", "targetName": "organizations" }
]
},

…,
{
  "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
  "uri": "https://myskill.azurewebsites.net/api/OrgId",
  "context": "/document/organizations/*",
  "httpHeaders": { "Api-Key": "mySecret" },
  "inputs":
  [
    { "name": "organizationName", "source": "/document/organizations/*" }
  ],
  "outputs":
  [
    { "name": "organizationId", "targetName": "organizationId" }
  ]
},
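
Each skill's inputs must point at paths that an earlier skill (or the source document) produces, so a mistyped source path is an easy mistake to make. The following sketch is a hypothetical offline check, not part of the service: it assumes skills are listed in dependency order, that paths are compared case-sensitively, and that an output lands at the skill's context plus its targetName. The function name is illustrative.

```python
def find_unresolved_sources(skills, known_paths=("/document/content",)):
    """Return (skill index, input name, source) for inputs nothing produces."""
    available = set(known_paths)
    problems = []
    for index, skill in enumerate(skills):
        for item in skill.get("inputs", []):
            source = item["source"]
            # "/x/*" iterates over the collection at "/x", so check "/x".
            base = source[:-2] if source.endswith("/*") else source
            if base not in available:
                problems.append((index, item["name"], source))
        context = skill.get("context", "/document")
        for out in skill.get("outputs", []):
            target = out.get("targetName", out["name"])
            available.add(f"{context}/{target}")
    return problems


skills = [
    {"inputs": [{"name": "text", "source": "/document/content"}],
     "outputs": [{"name": "languageCode", "targetName": "myLanguageCode"},
                 {"name": "languageName", "targetName": "myLanguageName"}]},
    {"inputs": [{"name": "text", "source": "/document/content"},
                # Deliberate typo (lowercase "c") to show what the check reports.
                {"name": "languageCode", "source": "/document/myLanguagecode"}],
     "outputs": [{"name": "organizations", "targetName": "organizations"}]},
    {"context": "/document/organizations/*",
     "inputs": [{"name": "organizationName", "source": "/document/organizations/*"}],
     "outputs": [{"name": "organizationId", "targetName": "organizationId"}]},
]
problems = find_unresolved_sources(skills)
print(problems)
```

Only the mistyped languageCode source is reported; the custom skill's input resolves because the entity recognition skill writes /document/organizations.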

[Figure: enrichment tree for document.pdf. /document has the children /languageCode, /keyPhrases, /organizations (each item carrying an organizationId), and /images (each item carrying tags).]

/document
  /keyPhrases
    /0
    /1
    /…
    /n
  /organizations
    /0  organizationId
    /1  organizationId
    /…  organizationId
    /n  organizationId
  /images
    /0  tags
    /1  tags
    /…  tags
    /n  tags
New Indexer Property
{
  …
  "outputFieldMappings":
  [
    {
      "sourceFieldName": "/document/organizations/*/organizationId",
      "targetFieldName": "orgIds"
    },
    …
  ]
}
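
To see what the mapping above selects, it helps to resolve the source path against a small in-memory tree. This is a rough sketch, assuming the enrichment tree can be modeled as nested dicts and lists (a simplification of the real structure); the resolver function, organization names, and ids are hypothetical. The "*" segment expands to every item of a collection, so the target field receives a list.

```python
def resolve_path(tree, path):
    """Collect every value matched by a slash-separated path with '*' wildcards."""
    current = [tree]
    for segment in (s for s in path.split("/") if s):
        matched = []
        for node in current:
            if segment == "*":
                # Wildcard: step into every item of a collection.
                if isinstance(node, list):
                    matched.extend(node)
            elif isinstance(node, dict) and segment in node:
                matched.append(node[segment])
        current = matched
    return current


# Hypothetical enriched document; names and ids are made up for illustration.
tree = {"document": {"organizations": [
    {"name": "Litware Insurance Corp.", "organizationId": "org-001"},
    {"name": "Contoso", "organizationId": "org-002"},
]}}

index_document = {
    "orgIds": resolve_path(tree, "/document/organizations/*/organizationId"),
}
print(index_document)
```

Under this model the orgIds field would need to be a collection type in the index, since one document can contain several organizations.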

“Lorem ipsum dolor sit amet,
consectetur adipiscing elit, sed
do eiusmod tempor incididunt ut
labore et dolore magna aliqua. Ut
enim ad minim veniam, quis
nostrud exercitation ullamco
laboris nisi…”
Class A
Class B
Class C

Entity type A
Entity type B

Labeled Data
Custom Entity Extraction Template
Azure ML
Annotated Documents
Customer Data
Search Index

Cognitive Search
Documentation | Sign up for Azure Search
Azure Machine Learning Package for Text Analytics
Documentation | Create a Data Science Virtual Machine
Cognitive Services
Documentation | Sign up
