Using Big Data to Analyze Airline Customer Satisfaction in Real Time

Author: Wang Meng (王萌)
May 3, 2012
Aviation


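Big data analytics is one of today's hottest topics. Many people dismiss it as hype from IT vendors, but applied properly, big data analytics really can deliver very high business value at very low cost.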

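Here is a case from the airline industry that uses big data to analyze customer satisfaction in real time. Jeffrey Breen, president of Cambridge Aviation Research, used the R language to analyze the attitudes that consumers publicly expressed on Twitter toward several major US airlines. See his slides for details: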

R by example: mining Twitter for consumer attitudes towards airlines

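On slide 27, the airline customer satisfaction figures that Breen derived from Twitter come out very close to the results of the American Customer Satisfaction Index (ACSI) survey.

What makes this result striking is the difference in effort. Breen used only the free R language and public Twitter data, and the whole social-data analysis can be done by one person, whereas ACSI is a time-consuming, labor-intensive and expensive market research program. More importantly, the biggest advantage of social data analysis over traditional market research is that it works in real time.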

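ACSI's ranking of customer satisfaction across US industries:

Customer satisfaction in the airline industry looks decent, ranking above banking, apparel and automobiles, but the table below shows that it trails industries such as hotels, restaurants and supermarkets:

Workflow for analyzing airline-related Twitter text data:

Comparison of the Twitter analysis results with the ACSI survey results (broadly consistent, with Southwest Airlines far ahead of the other carriers):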
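To make the comparison more concrete, here is a minimal sketch of how per-tweet sentiment scores might be rolled up into one figure per airline that can be set beside a survey index. It assumes each tweet already carries an integer score (positive word matches minus negative word matches). The airline names, the scores, the threshold of 2 and the column names are all invented for illustration; they are not Breen's data or necessarily his exact method.

```r
# Hypothetical per-tweet scores: made-up numbers for illustration, not real data.
all.scores <- data.frame(
  airline = c("Airline A", "Airline A", "Airline A", "Airline A",
              "Airline B", "Airline B", "Airline B", "Airline B"),
  score   = c(3, 2, -2, 0, 2, -3, -2, 1)
)

# Count only clearly opinionated tweets (the threshold of 2 is an assumption).
pos.count <- tapply(all.scores$score >= 2,  all.scores$airline, sum)
neg.count <- tapply(all.scores$score <= -2, all.scores$airline, sum)

# One row per airline, with the share of strong opinions that are positive.
summary.df <- data.frame(
  airline   = names(pos.count),
  pos.count = as.vector(pos.count),
  neg.count = as.vector(neg.count)
)
summary.df$pct.pos <- round(100 * summary.df$pos.count /
                              (summary.df$pos.count + summary.df$neg.count))

print(summary.df)
```

Ignoring tweets with scores near zero is a design choice: it drops neutral chatter and keeps only tweets with a clear opinion, at the cost of discarding part of the sample.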

Example R code for judging the sentiment of a tweet (positive or negative):

    score.sentiment = function(sentences, pos.words, neg.words, .progress='none')
    {
        require(plyr)
        require(stringr)

        # we got a vector of sentences. plyr will handle a list or a vector as an "l" for us
        # we want a simple array of scores back, so we use "l" + "a" + "ply" = laply:
        scores = laply(sentences, function(sentence, pos.words, neg.words) {

            # clean up sentences with R's regex-driven global substitute, gsub():
            sentence = gsub('[[:punct:]]', '', sentence)
            sentence = gsub('[[:cntrl:]]', '', sentence)
            sentence = gsub('\\d+', '', sentence)

            # and convert to lowercase:
            sentence = tolower(sentence)

            # split into words. str_split is in the stringr package
            word.list = str_split(sentence, '\\s+')

            # sometimes a list() is one level of hierarchy too much
            words = unlist(word.list)

            # compare our words to the dictionaries of positive & negative terms
            pos.matches = match(words, pos.words)
            neg.matches = match(words, neg.words)

            # match() returns the position of the matched term or NA
            # we just want a TRUE/FALSE:
            pos.matches = !is.na(pos.matches)
            neg.matches = !is.na(neg.matches)

            # and conveniently enough, TRUE/FALSE will be treated as 1/0 by sum():
            score = sum(pos.matches) - sum(neg.matches)

            return(score)
        }, pos.words, neg.words, .progress=.progress)

        scores.df = data.frame(score=scores, text=sentences)
        return(scores.df)
    }
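To see what the scoring rule does on concrete input, the following sketch applies the same steps (strip punctuation, control characters and digits, lowercase, split on whitespace, count dictionary matches) using base R only, so it runs without plyr or stringr. The helper name `score_sentence`, the tiny word lists and the sample tweets are assumptions made up for illustration; a real analysis would use a full opinion lexicon and actual tweets.

```r
# Toy word lists: assumptions for illustration only, not a real opinion lexicon.
pos.words <- c("great", "love", "awesome", "friendly")
neg.words <- c("delayed", "lost", "rude", "terrible")

# Score one sentence: (number of positive words) - (number of negative words).
score_sentence <- function(sentence, pos.words, neg.words) {
  sentence <- gsub("[[:punct:]]", "", sentence)  # drop punctuation
  sentence <- gsub("[[:cntrl:]]", "", sentence)  # drop control characters
  sentence <- gsub("\\d+", "", sentence)         # drop digits
  words <- unlist(strsplit(tolower(sentence), "\\s+"))
  sum(words %in% pos.words) - sum(words %in% neg.words)
}

# Made-up sample tweets.
tweets <- c(
  "Great flight, friendly crew!",
  "Flight delayed 3 hours and they lost my bag. Terrible.",
  "Landed on time."
)

# Apply the scorer to every tweet and collect the results in a data frame.
scores <- vapply(tweets, score_sentence, numeric(1),
                 pos.words = pos.words, neg.words = neg.words,
                 USE.NAMES = FALSE)
scores.df <- data.frame(score = scores, text = tweets)

print(scores.df)
```

The first tweet scores 2 (two positive matches), the second scores -3 (three negative matches), and the third scores 0 because none of its words appear in either list. This also shows the method's main limitation: it counts isolated words and is blind to negation, sarcasm and context.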





Tags: R, text mining, social analytics, social media analytics, semantic analysis

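About the author: Wang Meng (王萌)

More than ten years of consulting and entrepreneurial experience in the TMT sector. Currently focused mainly on information security, while closely following technical innovation and business value in cloud computing, social media, mobile, Enterprise 2.0 and related fields. Holds an MBA from the Massachusetts Institute of Technology and a bachelor's degree from the School of Economics and Management at Tsinghua University. Formerly a senior consultant at BDA China, serving clients including Qualcomm, Intel, China Netcom, SK Telecom and Vodafone. Contact: wangmeng@ctocio.com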
