Every manufactured product has a set of desired and intended uses. Sometimes this list of uses has to be communicated to regulatory authorities for registration purposes, for example under REACH. Actual use does not always follow the intended list: users sometimes misuse or abuse a chemical, and such cases need careful analysis. Misuse of chemicals can have health implications, for example allergies in the exposed area or other issues. User-generated content from social media platforms can provide early warning about how a product is actually used, any resulting adverse effects, and how the product's intended uses are managed. Such analysis helps companies plan their product safety stewardship. Whenever a group of people discusses a given product or brand, we can retrieve the sentiment about that product. By monitoring the exchange of information or conversations within a population, adverse effects can be predicted.
The mapping and measuring of relationships and flows among people, groups, organizations, web sites and other actors is called social network analysis. Analysing the text in those networked applications is text mining, a specialized field that applies data mining techniques to text, for example to extract the sentiment of individuals or communities. We know well how Facebook posts were used to gauge sentiment during the Arab Spring, and there have been many use cases around disease surveillance, such as H1N1. Sentiment analysis can be framed as a supervised, semi-supervised or unsupervised classification task. Lexicon-based methods use dictionaries of words annotated with their semantic orientation, while machine learning methods use classifiers such as Naïve Bayes, Maximum Entropy and Support Vector Machines. Details about terms such as segmentation, stemming, entity extraction and stop-word management can easily be found by searching the internet. Various ways to build text analysis in Python or R, involving OAuthHandler, NLP and gensim, can be found on GitHub. The analysis can be of two types: a search over historical content, or a streaming analysis that processes content as it is updated.
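To make the dictionary-based approach concrete, here is a minimal sketch in plain Python. It is an illustration only: the lexicon, the stop-word list, the negation handling and the function names are all assumptions made up for this example, not a published sentiment dictionary or any particular library's API.

```python
import re

# Hypothetical, hand-made lexicon: word -> semantic orientation score.
# A real study would use a curated sentiment dictionary instead.
LEXICON = {
    "safe": 1, "effective": 1, "good": 1,
    "rash": -1, "allergy": -1, "headache": -1, "toxic": -2,
}

# Small illustrative stop-word list (assumption, far from complete).
STOP_WORDS = {"the", "a", "an", "is", "it", "and", "my", "me", "this", "i"}

# Words that flip the orientation of the next scored word.
NEGATIONS = {"not", "no", "never"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment_score(text):
    """Sum the lexicon scores of all tokens, flipping the sign after a negation."""
    score = 0
    negate = False
    for token in tokenize(text):
        if token in NEGATIONS:
            negate = True
            continue
        if token in STOP_WORDS:
            continue
        value = LEXICON.get(token, 0)
        score += -value if negate else value
        negate = False  # a negation only affects the next non-stop word
    return score


def label(text):
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Calling `label("This product gave me a rash and a headache")` would classify the post as negative under this toy lexicon. A supervised classifier such as Naïve Bayes would replace the fixed dictionary with weights learned from labelled posts.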
Details on how text analysis can be performed in SAP HANA have been given by many authors, including in an SAP Community blog post, and a book has also been published on text analytics with SAP. I liked the video series on text analysis available on YouTube. SAP Predictive Analytics also offers a text processing module, though it is somewhat primitive compared with SAP HANA and the Data Text engine.

There are many approaches to carrying out the analysis. In overview, it consists of authentication, data collection, data cleaning, modelling and analysis, and final reporting. Authentication and data collection involve getting the data from social media through APIs: using a REST API to search the content, processing HTTP requests, handling a streaming connection and sometimes responding to requests as required. Once the data is collected, it is tokenized and then vectorized.
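The cleaning, tokenization and vectorization steps can be sketched in a few lines of plain Python. This is a simplified bag-of-words illustration, not how SAP HANA or any specific library does it; the function names and the cleaning rules (dropping URLs and @mentions) are assumptions chosen for the example.

```python
import re
from collections import Counter


def clean(text):
    """Remove URLs and @mentions, then lower-case the text."""
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    return text.lower()


def tokenize(text):
    """Split cleaned text into alphabetic tokens (hashtag words are kept)."""
    return re.findall(r"[a-z]+", clean(text))


def build_vocabulary(token_lists):
    """Assign a column index to every distinct term, in alphabetical order."""
    terms = sorted({token for tokens in token_lists for token in tokens})
    return {term: index for index, term in enumerate(terms)}


def vectorize(tokens, vocabulary):
    """Turn one token list into a term-count vector over the vocabulary."""
    counts = Counter(tokens)
    return [counts.get(term, 0) for term in vocabulary]
```

For a handful of collected posts, you would call `tokenize` on each one, pass the resulting token lists to `build_vocabulary`, and then call `vectorize` per post to get the rows of a document-term matrix that a classifier can consume.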
The example above shows a hypothetical scenario in which text analysis of a Twitter stream surfaces information about naphthalene use and its occupational impacts. Such detailed analysis is only the beginning of a series of more detailed studies, such as sentiment around your critical material suppliers (supply chain network), your consumer sentiment, and insights into how a particular brand is faring in the market. Sentiment analysis of a product, combined with product safety stewardship information, will help companies chalk out competitive plans for their future products.
Thanks – Jak