书签 分享 收藏 举报 版权申诉 / 96
上传文档赚钱

类型用机器学习的方法理解社会媒体课件.ppt

  • 上传人(卖家):三亚风情
  • 文档编号:3272220
  • 上传时间:2022-08-15
  • 格式:PPT
  • 页数:96
  • 大小:14.36MB
  • 【下载声明】
    1. 本站全部试题类文档,若标题没写含答案,则无答案;标题注明含答案的文档,主观题也可能无答案。请谨慎下单,一旦售出,不予退换。
    2. 本站全部PPT文档均不含视频和音频,PPT中出现的音频或视频标识(或文字)仅表示流程,实际无音频或视频文件。请谨慎下单,一旦售出,不予退换。
    3. 本页资料《用机器学习的方法理解社会媒体课件.ppt》由用户(三亚风情)主动上传,其收益全归该用户。163文库仅提供信息存储空间,仅对该用户上传内容的表现方式做保护处理,对上传内容本身不做任何修改或编辑。 若此文所含内容侵犯了您的版权或隐私,请立即通知163文库(点击联系客服),我们立即给予删除!
    4. 请根据预览情况,自愿下载本文。本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
    5. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007及以上版本和PDF阅读器,压缩文件请下载最新的WinRAR软件解压。
    配套讲稿:

    如PPT文件的首页显示word图标,表示该PPT已包含配套word讲稿。双击word图标可打开word文档。

    特殊限制:

    部分文档作品中含有的国旗、国徽等图片,仅作为作品整体效果示例展示,禁止商用。设计者仅对作品中独创性部分享有著作权。

    关 键  词:
    机器 学习 方法 理解 社会 媒体 课件
    资源描述:

    1、Understanding Social Mediawith Machine LearningXiaojin Zhujerryzhucs.wisc.eduDepartment of Computer SciencesUniversity of WisconsinMadison,USACCF/ADL Beijing 2013Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 20131/95Outline1234Spatio-Temporal Signal Recovery from Social MediaMachine Lea

    2、rning BasicsProbabilityStatistical EstimationDecision TheoryGraphical ModelsRegularizationStochastic ProcessesSocioscope:A Probabilistic Model for Social MediaCase Study:RoadkillZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 20132/95Spatio-Temporal Signal Recovery from Social MediaOutlin

    3、e1234Spatio-Temporal Signal Recovery from Social MediaMachine Learning BasicsProbabilityStatistical EstimationDecision TheoryGraphical ModelsRegularizationStochastic ProcessesSocioscope:A Probabilistic Model for Social MediaCase Study:RoadkillZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijin

    4、g 20133/95Spatio-Temporal Signal Recovery from Social MediaSpatio-temporal Signal:When,Where,How MuchDirect instrumental sensing is di cult and expensiveZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 20134/95Spatio-Temporal Signal Recovery from Social MediaHumans as SensorsZhu (U Wiscons

    5、in)Understanding Social MediaCCF/ADL Beijing 20135/95Spatio-Temporal Signal Recovery from Social MediaHumans as SensorsNot“hot trend”discovery:We know what event we want to monitorNot natural language processing for social media:We are given areliable text classifier for“hit”Our task:precisely estim

    6、ating a spatiotemporal intensity function fstof a pre-defined target phenomenon.Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 20136/95Spatio-Temporal Signal Recovery from Social MediaChallenges of Using Humans as SensorsKeyword doesnt always mean eventIII was just told I look like dead

    7、crow.Dont blame me if one day I treat you like a dead crow.Human sensors arent under our controlLocation stamps may be erroneous or missingIIII3%have GPS coordinates:(-98.24,23.22)47%have valid user profile location:Bristol,UK,New York50%dont have valid location informationHogwarts,In the tra c.blah

    8、,Sitting On A TacoZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 20137/95Spatio-Temporal Signal Recovery from Social MediaProblem DefinitionInput:A list of time and location stamps of the target posts.Output:fst Intensity of target phenomenon at location s(e.g.,NewYork)and time t(e.g.,0-

    9、1am)Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 20138/95Spatio-Temporal Signal Recovery from Social MediaWhy Simple Estimation is Badfst=xst,the count of target posts in bin(s,t)Justification:MLE of the model x Poisson(f)However,IIIPopulation Bias:Assume fst=fs0t0,if more users in(s,t

    10、),thenxst xs0t0Imprecise location:Posts without location stamp,noisy user profilelocationZero/Low counts:If we dont see tweets from Antarctica,no penguinsthere?Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 20139/95Machine Learning BasicsOutline1234Spatio-Temporal Signal Recovery from So

    11、cial MediaMachine Learning BasicsProbabilityStatistical EstimationDecision TheoryGraphical ModelsRegularizationStochastic ProcessesSocioscope:A Probabilistic Model for Social MediaCase Study:RoadkillZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201310/95Machine Learning BasicsProbabilit

    12、yOutline1234Spatio-Temporal Signal Recovery from Social MediaMachine Learning BasicsProbabilityStatistical EstimationDecision TheoryGraphical ModelsRegularizationStochastic ProcessesSocioscope:A Probabilistic Model for Social MediaCase Study:RoadkillZhu (U Wisconsin)Understanding Social MediaCCF/ADL

    13、 Beijing 201311/95Machine Learning BasicsProbabilityProbabilityThe probability of a discrete random variable A taking the value a isP(A=a)2 0,1.Sometimes written as P(a)when no danger of confusion.NormalizationJoint probability P(A=a,B=b)=P(a,b),the two events bothhappen at the same time.Marginaliza

    14、tion P(A=a)=B”.P(a,b)The product rule P(a,b)=P(a)P(b|a)=P(b)P(a|b).Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201312/95Bayes rule P(a|b)=P(b|a)P(a).In general,P(a|b,C)=P(b|C)Rp(D|)p()d the evidence,Machine Learning BasicsProbabilityBayes RuleP(b)P(b|a,C)P(a|C)where C can be one or mo

    15、rerandom variables.Bayesian approach:when is model parameter,D is observed data,we havep(|D)=p(D|)p()p(D),Rp(D|)d 6=1),IIIIp()is the prior,p(D|)the likelihood function(of,not normalized:p(D)=p(|D)the posterior.Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201313/95Machine Learning Basic

    16、sProbabilityIndependenceThe product rule can be simplified as P(a,b)=P(a)P(b)i A andB are independentEquivalently,P(a|b)=P(a),P(b|a)=P(b).Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201314/95R x2P(x1 X 1 is possible!Integrates to 1.x1Marginalization p(x)=p(x)dx=11p(x)dx1 p(x,y)dyZhu (

    17、U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201315/95pMachine Learning BasicsProbabilityExpectation and VarianceThe expectation(“mean”or“average”)of a function f under theprobability distribution P isEPf=P(a)f(a)aEpf=p(x)f(x)dxxIn particular if f(x)=x,this is the mean of the random variabl

    18、e x.The variance of f isVar(f)=E(f(x)Ef(x)2=Ef(x)2Ef(x)2The standard deviation is std(f)=Var(f).Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201316/95Machine Learning BasicsProbabilityMultivariate StatisticsWhen x,y are vectors,Ex is the mean vectorCov(x,y)is the covariance matrix with

    19、 i,j-th entry beingCov(xi,yj).Cov(x,y)=Ex,y(xEx)(yEy)=Ex,yxyExEyZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201317/958 (d-sided die)f(x)=nx1,.,xdxk B(1)(Marginal)x N(x,A)(Conditional)y|x N(y+CA1(xx),BCA1C)Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201322/95Machine Lear

    20、ning BasicsProbabilityMore Continuous Distributions0with 0.Generalizes factorial:(n)=(n 1)!when n is a positive integer.(+1)=()for 0.parameter 0 and scale parameter 0f(x)=1()x1ex/,x 0.Conjugate prior for Poisson rate.Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201323/95Machine Learnin

    21、g BasicsStatistical EstimationOutline1234Spatio-Temporal Signal Recovery from Social MediaMachine Learning BasicsProbabilityStatistical EstimationDecision TheoryGraphical ModelsRegularizationStochastic ProcessesSocioscope:A Probabilistic Model for Social MediaCase Study:RoadkillZhu (U Wisconsin)Unde

    22、rstanding Social MediaCCF/ADL Beijing 201324/95Machine Learning BasicsStatistical EstimationParametric ModelsA statistical model H is a set of distributions.In machine learning,we call H the hypothesis space.A parametric model can be parametrized by a finite number ofparameters:f(x)f(x;)with paramet

    23、er 2 Rd:H=f(x;):2 Rdwhere is the parameter space.Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201325/95Statistical EstimationMachine Learning BasicsParametric ModelsWe denote the expectationE(g)=Zxg(x)f(x;)dxE means Ex f(x;),not over di erent s.data1All(parametric)models are wrong.Some

    24、 are more useful than others.Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201326/95Machine Learning BasicsStatistical EstimationNonparametric modelA nonparametric model cannot be parametrized by a fixed number ofparameters.Model complexity grows indefinitely with sample sizeExample:H=P

    25、 :V arP(X)0,favor homogeneous chainsWhen the parameter a 1(X)Multivariate Gaussian1xi,xj are conditionally independent given all other variables,if andonly if ij=0When ij 6=0,there is an edge between xi,xjZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201362/95Machine Learning BasicsGrap

    26、hical ModelsConditional Independence in Markov Random FieldsTwo group of variables A,B are conditionally independent givenanother group C,if:A,B become disconnected by removing C and all edges involving CACBZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201363/95Machine Learning BasicsRe

    27、gularizationOutline1234Spatio-Temporal Signal Recovery from Social MediaMachine Learning BasicsProbabilityStatistical EstimationDecision TheoryGraphical ModelsRegularizationStochastic ProcessesSocioscope:A Probabilistic Model for Social MediaCase Study:RoadkillZhu (U Wisconsin)Understanding Social M

    28、ediaCCF/ADL Beijing 201364/95Machine Learning BasicsRegularizationRegularization for Maximum LikelihoodCan overfit.Regularized likelihoodn()+()()is the regularizer,for example ()=kk2.Coincides with MAP estimate with prior distributionp()/exp()Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijin

    29、g 201365/95kxiMachine Learning BasicsRegularizationGraph-based regularizationNodes:x1.xn,=f=(f(x1),.,f(xn)Edges:similarity weights computed from features,e.g.,IIk-nearest-neighbor graph,unweighted(0,1 weights)fully connected graph,weight decays with distancexjk2/2Iw=exp-radius graphAssumption Nodes

    30、connected by heavy edge tend to have the samevalue.x1x3x2Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201366/95RegularizationMachine Learning BasicsGraph energyf incurs the energyXi jwij(f(xi)f(xj)2smooth f has small energyconstant f has zero energyZhu (U Wisconsin)Understanding Social

    31、 MediaCCF/ADL Beijing 201367/95Machine Learning BasicsRegularizationAn electric network interpretationEdges are resistors with conductance wijNodes clamped at voltages specified by fEnergy=heat generated by the network in unit time+1 voltRij=1wij10Zhu (U Wisconsin)Understanding Social MediaCCF/ADL B

    32、eijing 201368/95Machine Learning BasicsRegularizationThe graph LaplacianWe can express the energy of f in closed-form using the graph Laplacian.I symmetric,non-negativeDiagonal degree matrix D:Dii=j=1 WijGraph Laplacian matrix=DWThe energyXi jwij(f(xi)f(xj)2=f fZhu (U Wisconsin)Understanding Social

    33、MediaCCF/ADL Beijing 201369/95XGaussian likelihood yi=f(xi)+i where i N(0,2),andMachine Learning BasicsRegularizationGraph Laplacian as a RegularizerRegression problem with training data xi 2 Rd,yi 2 R,i=1.nAllow f(Xi)to be di erent from Yi,but penalize the di erence witha Gaussian log likelihoodReg

    34、ularizer (f)=f fminfni=1(f(xi)yi)2+f fEquivalent to MAP estimate withII1Gaussian Random Field prior p(f)=Z exp f fZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201370/95PnPnPnLf=graph regularizer fi=1 aiMachine Learning BasicsRegularizationGraph Spectrum and RegularizationAssumption:lab

    35、els are“smooth”on the graph,characterized by the graphspectrum(eigen-values/vectors(i,i)ni=1 of the Laplacian L):L=i=1 i i ia graph has k connected components if and only if 1=.=k=0.the corresponding eigenvectors are constant on individual connectedcomponents,and zero elsewhere.any f on the graph ca

    36、n be represented as f=i=1 ai i2ismooth function f uses smooth basis(those with smalli)Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201371/95Machine Learning BasicsRegularizationExample graph spectrumThe graphEigenvalues and eigenvectors of the graph Laplacian1=0.006=0.3811=1.7916=3.342

    37、=0.007=0.6612=2.2117=3.623=0.048=1.0013=2.6218=3.624=0.179=1.3814=2.6219=3.835=0.3810=1.3815=3.0020=3.96Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201372/95Machine Learning BasicsStochastic ProcessesOutline1234Spatio-Temporal Signal Recovery from Social MediaMachine Learning BasicsPr

    38、obabilityStatistical EstimationDecision TheoryGraphical ModelsRegularizationStochastic ProcessesSocioscope:A Probabilistic Model for Social MediaCase Study:RoadkillZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201373/95Machine Learning BasicsStochastic ProcessesStochastic ProcessInfinit

    39、e collection of random variables indexed by a set x.x 2 R for“time”More generally,x 2 Rd(e.g.,space and time).Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201374/95Machine Learning BasicsStochastic ProcessesGaussian ProcessA stochastic process where any finite number of random variable

    40、shave a joint Gaussian distribution.Index x 2 Rdf(x)2 R:the random variable indexed by xmean function m(x)=Ef(x),e.g.m(x)=0,8xcovariance function(kernel)k(x,x0)=E(f(x)m(x)(f(x0)m(x0),e.g.RBF12 2Gaussian Process(GP)f(x)GP(m(x),k(x,x0)A draw from a GP is a function f over Rd.Zhu (U Wisconsin)Understan

    41、ding Social MediaCCF/ADL Beijing 201375/950Covariance matrix =AMachine Learning BasicsStochastic ProcessesGaussian Process vs.Gaussian DistributionFor any x1,.,xn,(f(x1),.,f(xn)N(,)Mean vector =(m(x1),.,m(xn)k(x1,x1).k(x1,xn).k(xn,x1).k(xn,xn)1Zhu (U Wisconsin)Understanding Social MediaCCF/ADL Beiji

    42、ng 201376/95RExpected number of events in region T Rd isT =Stochastic ProcessesMachine Learning BasicsPoisson ProcessRate function (x)Ihomogeneous Poisson Process:(x)=constant(x)dxI inhomogeneous Poisson Process:otherwiseTThe number of events in T follows a Poisson distributionP(N(T)=k)=eTkTk!Zhu (U

    43、 Wisconsin)Understanding Social MediaCCF/ADL Beijing 201377/95Machine Learning BasicsStochastic ProcessesCox ProcessDoubly stochastic processThe rate (x)itself sampled from another stochastic processEg,Sigmoidal Gaussian Cox Process Adams et al.ICML09(x)=(f(x)upper bound intensity11+ezlogistic“squas

    44、hing”(z)=f(x)GPZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201378/95Socioscope:A Probabilistic Model for Social MediaOutline1234Spatio-Temporal Signal Recovery from Social MediaMachine Learning BasicsProbabilityStatistical EstimationDecision TheoryGraphical ModelsRegularizationStochas

    45、tic ProcessesSocioscope:A Probabilistic Model for Social MediaCase Study:RoadkillZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201379/95Socioscope:A Probabilistic Model for Social MediaCorrecting Population BiasSocial media user activity intensity gstx Poisson(f,g)Link function(target p

    46、ost intensity)(f,g)=f gCount of all posts z Poisson(g)gst can be accurately recoveredZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201380/95Socioscope:A Probabilistic Model for Social MediaHandling Imprecise LocationReproduced from Vardi et al(1985),A statistical model for positronemiss

    47、ion tomographyZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201381/95Socioscope:A Probabilistic Model for Social MediaHandling Imprecise Location:TransitionZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201382/95Socioscope:A Probabilistic Model for Social MediaHandling Impre

    48、cise Location:TransitionZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201383/95Socioscope:A Probabilistic Model for Social MediaHandling Zero/Low CountsZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201384/95Socioscope:A Probabilistic Model for Social MediaThe Graphical Mode

    49、lZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201385/95XXSocioscope:A Probabilistic Model for Social MediaOptimization and Parameter Tuningmin2Rnmi=1hi)+()(xi loghij=logfjhi=nj=1Pij(j,j)Quasi-Newton method(BFGS)Cross-Validation:Data-based and objective approach toregularization;Sub-sam

    50、ple events from the total observationsZhu (U Wisconsin)Understanding Social MediaCCF/ADL Beijing 201386/95If x Poisson(h),then E(xhh)2=hSocioscope:A Probabilistic Model for Social MediaTheoretical ConsiderationHow many posts do we need to obtain reliable recovery?1 x1:more counts,lesserrorTheorem:Le

    展开阅读全文
    提示  163文库所有资源均是用户自行上传分享,仅供网友学习交流,未经上传用户书面授权,请勿作他用。
    关于本文
    本文标题:用机器学习的方法理解社会媒体课件.ppt
    链接地址:https://www.163wenku.com/p-3272220.html

    Copyright@ 2017-2037 Www.163WenKu.Com  网站版权所有  |  资源地图   
    IPC备案号:蜀ICP备2021032737号  | 川公网安备 51099002000191号


    侵权投诉QQ:3464097650  资料上传QQ:3464097650
       


    【声明】本站为“文档C2C交易模式”,即用户上传的文档直接卖给(下载)用户,本站只是网络空间服务平台,本站所有原创文档下载所得归上传人所有,如您发现上传作品侵犯了您的版权,请立刻联系我们并提供证据,我们将在3个工作日内予以改正。

    163文库