HRDF-Funded Course: Text Mining with R
Details
It is estimated that over 70% of potentially usable business information is unstructured, often in the form of text. Text mining provides a collection of techniques for deriving actionable insights from this data.
This course introduces the main tools and techniques for mining and analyzing text data to discover interesting patterns, extract useful knowledge, and support decision making, with an emphasis on statistical approaches to making sense of unstructured data. You will work through a live example of extracting data from the web and carry out each stage of text mining using R.
The topics include:
- Sentiment analysis
- Word clouds
- N-grams
- Topic modelling
- Latent Dirichlet Allocation (LDA)
- Extracting text from social media
Outline
Module 1: Introduction
- What is text mining
- Applications of text mining
Module 2: Basic Text Functions
- Text manipulation functions
- Working with strings
- Working with gsub
- Advanced methods
- Convert to corpus
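As a taste of the string-handling topics in this module, here is a minimal sketch of text clean-up using base R's `gsub`. The sample strings and the helper name `clean_text` are made up for illustration and are not taken from the course material; converting the result to a corpus would typically need an add-on package and is not shown.

```r
# Hypothetical sample text, invented for illustration only
raw <- c("  Text Mining with R!!  ",
         "Visit http://example.com for MORE info...")

# clean_text is an illustrative helper, not a function from any package
clean_text <- function(x) {
  x <- tolower(x)                         # normalise case
  x <- gsub("http[^[:space:]]+", "", x)   # drop URLs
  x <- gsub("[[:punct:]]+", " ", x)       # replace punctuation with a space
  x <- gsub("[[:space:]]+", " ", x)       # collapse runs of whitespace
  trimws(x)                               # strip leading/trailing spaces
}

cleaned <- clean_text(raw)
print(cleaned)
```

The order of the steps matters here: URLs are removed before punctuation, since stripping punctuation first would break the URL pattern.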
Module 3: Importing Data
- Converting docx into corpus
- Converting pdf into corpus
- Converting html to corpus
- Web scraping
Module 4: Tidytext Package
- Tidying text objects
- Tidying document-term matrix objects
- Tidying document-feature matrix objects
- Tidying corpus objects
- Mining literary works
Module 5: Word Frequencies & Relationships
- Pre-processing text
- Word clouds
- Frequency analysis
- N-grams & bigrams
- Bigrams for sentiment analysis
- Visualizing bigrams network
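To illustrate what a bigram frequency count looks like, the following is a small sketch in base R only. The sentence is invented for the example, and the course itself may well use a dedicated package for tokenizing instead of this hand-rolled approach.

```r
# Hypothetical sentence, invented for illustration only
text <- "the quick brown fox jumps over the quick brown dog"

# Split into lowercase words on anything that is not a letter
words <- strsplit(tolower(text), "[^a-z]+")[[1]]
words <- words[nzchar(words)]             # drop empty tokens, if any

# Pair each word with the word that follows it
bigrams <- paste(head(words, -1), tail(words, -1))

# Count how often each bigram occurs, most frequent first
bigram_counts <- sort(table(bigrams), decreasing = TRUE)
print(bigram_counts)
```

The same pairing idea extends to longer n-grams by pasting together more shifted copies of the word vector.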
Module 6: Sentiment Analysis
- Sentiment libraries
- Analyzing positive & negative words
- Comparing 3 sentiment libraries
- Common positive & negative words
Module 7: Topic Modelling
- Latent Semantic Indexing (LSI)
- Latent Dirichlet Allocation (LDA)
- Word-topic probabilities
- Document-topic probabilities
- Chapter probabilities
- Per-document classification
Module 8: Document Similarity & Classifier
- Text alignment & pairwise comparison
- Minhashing and locality-sensitive hashing
- Extracting keywords
- Classify by location, language, topic
Module 9: Working with the Internet and Social Media (Optional)
- Extracting data from Amazon
- Extracting data from Twitter
- Extracting YouTube comments
- Extracting Facebook comments
Speaker/s
All our courses and training programmes are funded by HRDF (Human Resources Development Fund Malaysia). Our courses cover Infocomm, Digital Media, Robotics, Semiconductor, Telecommunication, Life Science, Horticulture Industries, and Business Administration. Below are some of our popular courses:
- Python Programming
- R Programming
- Tableau
- Machine Learning
- Raspberry Pi
- Arduino
- 3D Printing
- iOS Apps Development
- Android Apps Development
- Magento eCommerce
- WordPress
- Joomla
- Search Engine Optimization
- Web Design
- Google Analytics
- Facebook Marketing