Text Analytics of Multicultural, Metaphorical, and Multilingual Language

Text analytics using the “bag of words” technique is fairly well proven within a given domain in complex organizations. The bigger challenges tend to be personalized jargon, abbreviations, and terms that differ but refer to the same thing or have nearly the same meaning. A text analytics system can “read” these strings and estimate the probability of each of a set number of hypotheses. Depending on the potential severity, records can be flagged for review by human analysts, which may or may not lead to an investigation, a task better suited to a human expert. Despite the limitations, screening for the possible “needles in the haystack” is hugely valuable and delivers business benefit, especially when applied to functions such as insurance claims fraud detection and customer preference analysis. Taxonomies of terms can be developed to create hierarchical links from general terms such as “disaster” to specific ones such as earthquake, wildfire, terror, and explosion.
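As a rough illustration of the ideas above, here is a minimal Python sketch of a bag of words that first maps jargon and abbreviations to a canonical term and then rolls specific terms up to a general category. The synonym map, the taxonomy entries beyond those named in the text, the function names, and the review threshold are all hypothetical placeholders, not a description of any production system.

```python
import re
from collections import Counter

# Hypothetical synonym map: jargon or abbreviations -> canonical term.
SYNONYMS = {
    "quake": "earthquake",
    "tremor": "earthquake",
    "blaze": "wildfire",
    "bushfire": "wildfire",
    "blast": "explosion",
}

# Taxonomy from specific term -> general term, following the "disaster" example.
TAXONOMY = {
    "earthquake": "disaster",
    "wildfire": "disaster",
    "terror": "disaster",
    "explosion": "disaster",
}


def bag_of_words(text):
    """Lowercase, tokenize, and count words after normalizing synonyms."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(SYNONYMS.get(token, token) for token in tokens)


def roll_up(bag):
    """Aggregate counts of specific terms into their general category."""
    general = Counter()
    for term, count in bag.items():
        if term in TAXONOMY:
            general[TAXONOMY[term]] += count
    return general


def needs_review(text, category="disaster", threshold=2):
    """Flag a record for a human analyst when a category count reaches the threshold."""
    return roll_up(bag_of_words(text))[category] >= threshold
```

For example, `needs_review("Quake damage after the tremor and a blaze")` flags the record, because the three differently worded terms all roll up to the same category. The flag only routes the record to a human analyst; it does not replace the expert's judgment.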

In business, text is typically captured in a single language or in several major languages. Online translators are prevalent and are relevant to the bag of words application. Their primary shortfall is that they do not yet mirror common syntax and often miss the meaning. To a native speaker, the results are awful.

Text mining metaphorical language such as dreams and spiritual experiences adds an entirely new level of complexity. Decoding metaphors is more art than science. It is more tractable from a purely secular perspective, using Freudian and Jungian methods. With a holistic approach that includes a spiritual dimension, it is even more complex. As in psychology, there are patterns, in this case drawn from ancient cultures such as Judaism, that have been developed into a practical methodology that can produce consistent results. The most notable is from Streams Ministries in Dallas, TX, which has trained and certified hundreds of interpreters around the world. Even with this community, it is too tedious to interpret thousands of dreams by hand simply to find patterns and predictions.

Lastly, the decoding of metaphorical language must be culturally contextual. For example, if a Hindu dreams of a cow, it is unlikely that it refers to food. We all dream in our own culture.

We are developing data science techniques to perform text analytics that is multicultural and multilingual while the text is still in its metaphorical state. This will enable us to operate at scale without being misled by today's crude automated dream interpretation. We are learning this process by running manual keyword searches against the aggregated database from the Eyes2C.org journal. We maintain the privacy of the dreamer while discovering insights that are simply amazing and that pique our curiosity to keep following the rabbit trail wherever it leads.
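A keyword search over aggregated, de-identified entries could look something like the following Python sketch. The record layout (a `text` field plus identifying fields such as `dreamer`) and both function names are assumptions made for illustration; the actual structure of the Eyes2C.org journal database is not described here.

```python
import re
from collections import Counter


def strip_identity(entries):
    """Keep only the text of each entry so the dreamer cannot be identified."""
    return [{"text": entry["text"]} for entry in entries]


def keyword_counts(entries, keywords):
    """Count how many entries mention each keyword as a whole word, ignoring case."""
    counts = Counter()
    for entry in entries:
        # \w+ matches Unicode word characters, so non-English text is tokenized too.
        words = set(re.findall(r"\w+", entry["text"].lower()))
        for keyword in keywords:
            if keyword.lower() in words:
                counts[keyword] += 1
    return counts
```

Counting entries rather than raw word occurrences keeps one long dream from dominating the tally. Keywords in different languages (for example "cow" and "vaca") are counted separately, so the symbols stay in their original, uninterpreted form.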

Jeff O’Dell
Founder, Eyes2C
