Cognitive Services

APPLIES TO: SDK v3

Microsoft Cognitive Services let you tap into a growing collection of powerful AI algorithms developed by experts in the fields of computer vision, speech, natural language processing, knowledge extraction, and web search. The services simplify a variety of AI-based tasks, giving you a quick way to add state-of-the-art intelligence technologies to your bots with just a few lines of code. The APIs integrate into most modern languages and platforms. The APIs are also constantly improving, learning, and getting smarter, so experiences are always up to date.

Intelligent bots respond as if they can see the world as people see it. They discover information and extract knowledge from different sources to provide useful answers, and, best of all, they learn as they acquire more experience to continuously improve their capabilities.

Language understanding

The interaction between users and bots is mostly free-form, so bots need to understand language naturally and contextually. The Cognitive Services Language APIs provide powerful language models to determine what users want, to identify concepts and entities in a given sentence, and ultimately to allow your bots to respond with the appropriate action. The five APIs support several text analytics capabilities, such as spell checking, sentiment detection, language modeling, and extraction of accurate and rich insights from text.
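To make the idea of mapping "what users want" to "the appropriate action" concrete, here is a minimal sketch of intent dispatch. It assumes a hypothetical result shape (a top-scoring intent with a confidence score, plus a list of typed entities); the field names, the `BookFlight` intent, and the handler functions are illustrative only and are not taken from any specific API reference.

```python
from typing import Callable, Dict, List

# Hypothetical language-understanding result: top intent plus typed entities.
SAMPLE_RESULT = {
    "query": "book a flight to Paris",
    "topScoringIntent": {"intent": "BookFlight", "score": 0.92},
    "entities": [{"entity": "paris", "type": "Location"}],
}


def entities_of_type(result: dict, entity_type: str) -> List[str]:
    """Return the text of every entity that has the given type."""
    return [
        e["entity"]
        for e in result.get("entities", [])
        if e.get("type") == entity_type
    ]


def book_flight(result: dict) -> str:
    """Illustrative handler: use the first Location entity as the destination."""
    places = entities_of_type(result, "Location")
    if not places:
        return "Where would you like to fly?"
    return f"Looking for flights to {places[0].title()}."


def fallback(result: dict) -> str:
    """Reply used when no intent is recognized with enough confidence."""
    return "Sorry, I didn't understand that."


# Map intent names to handlers (the names here are assumptions).
HANDLERS: Dict[str, Callable[[dict], str]] = {"BookFlight": book_flight}


def respond(result: dict, min_score: float = 0.5) -> str:
    """Pick a handler from the top intent, falling back on low confidence."""
    top = result.get("topScoringIntent", {})
    if top.get("score", 0.0) < min_score:
        return fallback(result)
    return HANDLERS.get(top.get("intent"), fallback)(result)
```

Calling `respond(SAMPLE_RESULT)` routes the message to `book_flight`. Keeping the confidence threshold in one place makes it easy to tune how often the bot asks for clarification instead of guessing.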

Cognitive Services provides these APIs for language understanding:

Learn more about language understanding with Microsoft Cognitive Services.

Knowledge extraction

Cognitive Services provides four knowledge APIs that enable you to identify named entities or phrases in unstructured text, add personalized recommendations, provide auto-complete suggestions based on natural interpretation of user queries, search academic papers and other research, and answer users' questions like a personalized FAQ service.

  • The Entity Linking Intelligence Service annotates unstructured text with the relevant entities mentioned in the text. Depending on the context, the same word or phrase may refer to different things. This service understands the context of the supplied text and will identify each entity in your text.

  • The Knowledge Exploration Service provides natural language interpretation of user queries and returns annotated interpretations to enable rich search and auto-completion experiences that anticipate what the user is typing. Instant query completion suggestions and predictive query refinements are based on your own data and application-specific grammars to enable your users to perform fast queries.

  • The Academic Knowledge API returns academic research papers, authors, journals, conferences, topics, and universities from the Microsoft Academic Graph. Built as a domain-specific example of the Knowledge Exploration Service, the Academic Knowledge API provides a knowledge base using a graph-like dialog with search capabilities over hundreds of millions of research-related entities. Search for a topic, a professor, a university, or a conference, and the API will provide relevant publications and related entities. The grammar also supports natural queries like "Papers by Michael Jordan about machine learning after 2010".

  • The QnA Maker is an easy-to-use REST API and web-based service that trains AI to respond to users’ questions in a natural, conversational way. With optimized machine learning logic and the ability to integrate industry-leading language processing, QnA Maker distills semi-structured data like question and answer pairs into distinct, helpful answers.
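As an illustration of how a bot might consume a question-answering service such as QnA Maker, the sketch below picks the best answer from a response. The response shape (an `answers` list whose items carry `answer` and `score` fields, with scores on a 0 to 100 scale) is an assumption made for this example; check the API reference for the actual contract.

```python
from typing import Optional

# Assumed response shape: candidate answers, each with a 0-100 confidence score.
SAMPLE_RESPONSE = {
    "answers": [
        {"answer": "We are open 9am to 5pm, Monday to Friday.", "score": 87.5},
        {"answer": "Our office is in Seattle.", "score": 12.0},
    ]
}


def best_answer(response: dict, min_score: float = 50.0) -> Optional[str]:
    """Return the highest-scoring answer, or None if nothing is confident enough."""
    answers = response.get("answers", [])
    if not answers:
        return None
    top = max(answers, key=lambda a: a.get("score", 0.0))
    if top.get("score", 0.0) < min_score:
        return None
    return top["answer"]
```

Returning `None` below the threshold lets the bot fall back to another dialog rather than sending a low-confidence answer.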

Learn more about knowledge extraction with Microsoft Cognitive Services.

Speech recognition and conversion

Use the Speech APIs to add advanced speech skills to your bot that leverage industry-leading algorithms for speech-to-text and text-to-speech conversion, as well as speaker recognition. The Speech APIs use built-in language and acoustic models that cover a wide range of scenarios with high accuracy.

For applications that require further customization, you can use the Custom Recognition Intelligent Service (CRIS). CRIS lets you calibrate the language and acoustic models of the speech recognizer by tailoring them to the vocabulary of the application, or even to the speaking style of your users.

There are three Speech APIs available in Cognitive Services to process or synthesize speech:

The following resources provide additional information about adding speech recognition to your bot.

Learn more about speech recognition and conversion with Microsoft Cognitive Services.

Web search

The Bing Search APIs enable you to add intelligent web search capabilities to your bots. With a few lines of code, you can access billions of webpages, images, videos, news, and other result types. You can configure the APIs to return results by geographical location, market, or language for better relevance. You can further customize your search using the supported search parameters, such as SafeSearch to filter out adult content, and Freshness to limit results to a specific time period.
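The sketch below shows one way the market, safe-search, and freshness options might be assembled into a web search request. The endpoint URL, API version, and subscription-key header name are assumptions for illustration and may differ for your subscription or API version; the function only builds the request and does not send it.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode

# Assumed endpoint; confirm the URL and version in the API reference.
SEARCH_ENDPOINT = "https://api.cognitive.microsoft.com/bing/v7.0/search"


def build_search_request(
    query: str,
    subscription_key: str,
    market: str = "en-US",
    safe_search: str = "Moderate",
    freshness: Optional[str] = None,
    count: int = 10,
) -> Tuple[str, Dict[str, str]]:
    """Build the URL and headers for a web search call (no network access)."""
    params = {
        "q": query,
        "mkt": market,              # market, such as en-US
        "safeSearch": safe_search,  # filters adult content
        "count": count,             # number of results to request
    }
    if freshness:
        params["freshness"] = freshness  # restricts results by age, e.g. "Week"
    url = f"{SEARCH_ENDPOINT}?{urlencode(params)}"
    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    return url, headers
```

A bot would pass the returned URL and headers to whatever HTTP client it already uses. Separating request construction from sending also makes the parameters easy to unit test.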

There are five Bing Search APIs available in Cognitive Services.

  • The Web Search API provides web, image, video, news and related search results with a single API call.

  • The Image Search API returns image results with enhanced metadata (dominant color, image kind, etc.) and supports several image filters to customize the results.

  • The Video Search API retrieves video results with rich metadata (video size, quality, price, etc.) and video previews, and supports several video filters to customize the results.

  • The News Search API finds news articles around the world that match your search query or are currently trending on the Internet.

  • The Autosuggest API offers instant query completion suggestions to complete your search query faster and with less typing.

Learn more about web search with Microsoft Cognitive Services.

Image and video understanding

The Vision APIs bring advanced image and video understanding skills to your bots. State-of-the-art algorithms allow you to process images or videos and get back information you can transform into actions. For example, you can use them to recognize objects, people's faces, age, gender or even feelings.

The Vision APIs support a variety of image understanding features. They can identify mature or explicit content, estimate dominant and accent colors, categorize the content of images, perform optical character recognition, and describe an image with complete English sentences. The Vision APIs also support several image and video processing capabilities, such as intelligently generating image or video thumbnails, or stabilizing the output of a video.

Cognitive Services provides four APIs you can use to process images or videos:

  • The Computer Vision API extracts rich information about images (such as objects or people), determines if the image contains mature or explicit content, and processes text (using OCR) in images.

  • The Emotion API analyzes human faces and recognizes the emotion each one expresses across eight categories.

  • The Face API detects human faces, compares them to similar faces, and can even organize people into groups according to visual similarity.

  • The Video API analyzes and processes video to stabilize video output, detect motion, track faces, and generate a motion thumbnail summary of the video.
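To show how image analysis results can be turned into a bot reply, the sketch below summarizes a sample analysis result. The JSON field names (`adult`, `color`, `description`, and their children) are assumptions modeled on what an image analysis call might return, and the sample values are made up for illustration.

```python
# Assumed shape of an image analysis result; the values are illustrative only.
SAMPLE_ANALYSIS = {
    "adult": {"isAdultContent": False, "isRacyContent": False},
    "color": {"dominantColors": ["Blue", "White"], "accentColor": "19A4B2"},
    "description": {
        "captions": [{"text": "a boat on a lake", "confidence": 0.93}]
    },
}


def describe_image(analysis: dict, min_confidence: float = 0.5) -> str:
    """Turn an analysis result into a short, user-facing reply."""
    adult = analysis.get("adult", {})
    if adult.get("isAdultContent") or adult.get("isRacyContent"):
        return "I can't describe this image."

    captions = analysis.get("description", {}).get("captions", [])
    confident = [c for c in captions if c.get("confidence", 0.0) >= min_confidence]
    if not confident:
        return "I'm not sure what this image shows."

    # Use the caption the service is most confident about.
    best = max(confident, key=lambda c: c["confidence"])
    reply = f"I think this is {best['text']}."

    colors = analysis.get("color", {}).get("dominantColors", [])
    if colors:
        reply += f" Dominant colors: {', '.join(colors)}."
    return reply
```

Checking the content flags before building the caption keeps the bot from describing images it should not respond to.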

Learn more about image and video understanding with Microsoft Cognitive Services.

Additional resources

You can find comprehensive documentation for each product, along with its API reference, in the Cognitive Services documentation.