description | coverY |
---|---|
Natural Language Processing (NLP), also referred to as text analytics such as apply machine learning on text, building language models, building chatbots, building sentiment analysis systems, etc. |
0 |
There are no prerequisites for this session other than knowledge of Statistics and ML.
The different areas where text analytics is applied such as
- healthcare
- e-commerce
- retail
- financial
- various other industries
There are three stages in text analytics:
There are two most popular encoding standards:
- American Standard Code for Information Interchange (ASCII)
- Unicode
- UTF-8
- UTF-16
The first encoding standard that came into existence was the ASCII (American Standard Code for Information Interchange) standard, in 1960. ASCII standard assigned a unique code to each character of the keyboard which was known as ASCII code.
Unicode
When ASCII was built, English alphabets were the only alphabets that were present on the keyboard. With time, new languages began to show up on keyboard sets which brought new characters. ASCII became outdated and couldn’t incorporate so many languages. A new standard has come into existence in recent years - the Unicode standard. It supports all the languages in the world - both modern and the older ones.
While UTF-8 uses only 8 bits to store the character, UTF-16 (BE) uses 16 bits to store it.