Text mining is a useful tool for getting an overview of the subjects or important words in a text collection such as websites, books, or articles.
Creating a word cloud from text files with R is easy. First, install R and two packages: “tm” and “wordcloud” (if these packages depend on other packages, just follow R's instructions to install them). Then put all the text files you want to analyze in the same directory and run the following code in R:
# Loading libraries
library(tm)
library(wordcloud)

# Define the folder where the text files are
# (language is set to "en" here to match the English stopwords used below)
a <- Corpus(DirSource("C:/MyPath/FolderContaining/TxtFiles"), readerControl = list(language = "en"))

# Preprocessing text
a <- tm_map(a, removeNumbers) # Not necessary if numbers are important for you
a <- tm_map(a, removePunctuation)
a <- tm_map(a, stripWhitespace)
# Recent versions of tm need content_transformer() around base functions such as tolower
a <- tm_map(a, content_transformer(tolower))
# Stopwords are words such as "we", "the", "and", "so", etc. You can add your own words to the list
a <- tm_map(a, removeWords, c(stopwords("english"), "can", "also", "may"))
# a <- tm_map(a, stemDocument, language = "english") # You can also do stemming if you want

# Computing the term document matrix
tdm <- TermDocumentMatrix(a)

# Transforming data for wordcloud
m <- as.matrix(tdm)
v <- sort(rowSums(m), decreasing = TRUE) # total count of each word over all files
myNames <- names(v)
d <- data.frame(word = myNames, freq = v)

# Making and displaying the cloud (only words occurring at least 150 times are drawn)
wordcloud(d$word, d$freq, min.freq = 150)
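To get a feel for what the data frame d holds and how min.freq filters it, here is a minimal sketch in base R only. It does not use tm; the two sentences and the variable names docs, words, freq, min_freq and kept are made up for illustration, and the counting is a simplified stand-in for what the term document matrix does.

```r
# Two toy "documents" standing in for the text files (hypothetical content)
docs <- c("Text mining is fun", "Mining text with R is easy")

# Lowercase, then split on anything that is not a letter
words <- unlist(strsplit(tolower(docs), "[^a-z]+"))
words <- words[nchar(words) > 0] # drop empty strings left by the split

# Count each word and sort from most to least frequent
freq <- sort(table(words), decreasing = TRUE)
d <- data.frame(word = names(freq), freq = as.integer(freq),
                stringsAsFactors = FALSE)

# Keep only the words that would pass a given min.freq threshold
min_freq <- 2 # assumed value for this tiny example
kept <- d[d$freq >= min_freq, ]
print(kept)
```

If the cloud comes out empty or nearly empty on a small collection, checking how many rows of d reach the threshold in this way is a quick hint that min.freq should be lowered.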