# Text

A large amount of information is available in the form of text. Tweets, emails, survey responses, and product reviews, for example, all contain information written in natural language.

The goal of working with text is to convert it into data that is useful for analysis. Applications of text analysis include sentiment analysis, named entity recognition, and summarization.
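As a rough illustration of what "converting text into data" can mean, the sketch below turns free-text strings into flat records of simple numeric features. It uses only the Python standard library and is not taken from any of the plugins listed on this page; the function name `text_to_features`, the feature names, and the sample reviews are hypothetical.

```python
import re
from collections import Counter


def text_to_features(text: str) -> dict:
    """Turn one free-text string into a flat record of simple features.

    Hypothetical helper for illustration only; real text analysis
    (sentiment, entities, summaries) needs dedicated models.
    """
    # Lowercase, then keep runs of letters, digits and apostrophes as tokens.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(tokens)
    return {
        "n_chars": len(text),
        "n_tokens": len(tokens),
        "n_unique_tokens": len(counts),
        "most_common_token": counts.most_common(1)[0][0] if counts else None,
    }


# Made-up sample data: each review becomes one row of structured data.
reviews = [
    "Great product, great price!",
    "Arrived late and the box was damaged.",
]
rows = [text_to_features(review) for review in reviews]
```

Each element of `rows` is a plain dictionary, so the result can be written to a tabular dataset and analyzed alongside other columns.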

The following table lists the plugins currently available for working with text data.

Note

**Support level**: Each plugin listed here is either not supported or a Tier 2 supported feature.

| Plugin | Description | Language coverage |
| --- | --- | --- |
| Text preparation | Detect languages, correct misspellings and clean text data using open source libraries | Language detection: 114<br>Spell checking: 37<br>Text cleaning: 59 |
| Text Analysis | Analyze text data with ontology tagging | 59 languages |
| Sentiment analysis | Estimate sentiment polarity (positive/negative) of text data using open source models | English |
| Text summarization | Automatically summarize text data using open source algorithms to extract sentences | Language-agnostic |
| Named entity recognition | Extract information on named entities (people, dates, places, etc.) from text data using open source models | 7 languages |
| Speech to Text | Convert speech to text offline using open-source components | English |
| Amazon Transcribe | Use the Amazon Transcribe API to convert speech to text | 40 languages |
| Sentence embedding | Compute numerical sentence representations for use as feature vectors in a Machine Learning model or for similarity search, using open source models | English |
| Amazon Comprehend | Use the Amazon Comprehend API for language detection, sentiment analysis, named entity recognition and key phrase extraction | Language detection: 100<br>Other tasks: 12 |
| Amazon Comprehend Medical | Use the Amazon Comprehend Medical API for Protected Health Information extraction and medical entity recognition | English |
| Azure Cognitive Services – Text Analytics | Use the Azure Cognitive Services – Text Analytics API for language detection, sentiment analysis, named entity recognition and key phrase extraction | Language detection: 108<br>Sentiment analysis: 13<br>Named entity recognition: 23<br>Key phrase extraction: 16 |
| Crowlingo Multilingual NLP | Use the Crowlingo Multilingual NLP API for language detection, sentiment analysis, summarization and multiple other tasks | 102 languages |
| Google Cloud NLP | Use the Google Cloud NLP API for sentiment analysis, named entity recognition and text classification | Sentiment analysis: 16<br>Named entity recognition: 11<br>Text classification: English |
| Google Cloud Translation | Use the Google Cloud Translation API to translate text | 109 languages |
| Amazon Translation | Use the Amazon Translation API to translate text | 71 languages |
| Azure Translation | Use the Azure Translation API to translate text | 90 languages |
| DeepL Translation | Use the DeepL Translation API to translate text | 28 languages |
| Offline Translation | Translate text offline using open-source components | 100 languages |
| MeaningCloud | Use the MeaningCloud API for language detection, sentiment analysis, topic extraction, summarization and text classification | Language detection: 180<br>Sentiment analysis: 10<br>Topic extraction: 13<br>Summarization: language-agnostic<br>Text classification: 2 |
| NLG Tasks | Use the OpenAI API to perform tasks expressed in natural language, such as Zero-shot Classification or Q&A | English |
| Tesseract OCR | Perform Optical Character Recognition (OCR) offline using the Tesseract engine | 100 languages |
