Pattern Recognition Lecture (PPT courseware)
Pattern Recognition
Nanyang Technological University
Dr. Shi, Daming
Harbin Engineering University

Introduction

What is Pattern Recognition
- Classify raw data into the category of the pattern.
- A branch of artificial intelligence concerned with the identification of visual or audio patterns by computers, for example character recognition, speech recognition, and face recognition.
- Two categories: syntactic (or structural) pattern recognition and statistical pattern recognition.
- Pattern Recognition = Pattern Classification

[Diagram: a Training Phase and a Recognition Phase. Labels: Training data, Unknown data, Feature Extraction, Learning (feature selection, clustering, discriminant function generation, grammar parsing), Recognition (statistical, structural), Knowledge, Results.]

Categorisation
- Based on application areas:
  - Face recognition
  - Speech recognition
  - Character recognition
  - etc.
- Based on decision-making approaches:
  - Syntactic pattern recognition
  - Statistical pattern recognition

Syntactic Pattern Recognition
Any problem is described with a formal language, and the solution is obtained through grammatical parsing.
In memory of Prof. FU, King-Sun and Prof. Shu Wenhao.

Statistical Pattern Recognition
In the statistical approach, each pattern is viewed as a point in a multi-dimensional space. The decision boundaries are determined by the probability distribution of the patterns belonging to each class, which must either be specified or learned.

Scope of the Seminar
- Module 1 Distance-Based Classification
- Module 2 Probabilistic Classification
- Module 3 Linear Discriminant Analysis
- Module 4 Neural Networks for P.R.
- Module 5 Clustering
- Module 6 Feature Selection

Module 1 Distance-Based Classification

Overview
- Distance-based classification is the most common type of pattern recognition technique.
- Its concepts are a basis for other classification techniques.
- First, a prototype is chosen through training to represent a class.
- Then, the distance from an unknown sample to the class is calculated using the prototype.

Classification by distance
Objects can be represented by vectors in a space.
In training, we have the samples: [equation not recovered]
In recognition, an unknown sample is classified by distance: [equation not recovered]

Prototype
To find the pattern-to-class distance, we need to use a class prototype (pattern):
(1) Sample Mean. For class ci, [equation not recovered]
(2) Most Typical Sample. Choose [equation not recovered] such that [equation not recovered] is minimized.

Prototype: Nearest Neighbour
(3) Nearest Neighbour. Choose [equation not recovered] such that [equation not recovered] is minimized.
Nearest neighbour prototypes are sensitive to noise and outliers in the training set.

Prototype: k-NN
(4) k-Nearest Neighbours. k-NN is more robust against noise, but is more computationally expensive.
The pattern y is classified in the class of its k nearest neighbours from the training samples. The chosen distance determines how "near" is defined.

Distance Measures
- The most familiar distance metric is the Euclidean distance: [equation not recovered]
- Another example is the Manhattan distance: [equation not recovered]
- There are many other distance measures.

Minimum Euclidean Distance (MED) Classifier
[equation not recovered]
Equivalently, [equation not recovered]

Decision Boundary
Given a prototype and a distance metric, it is possible to find the decision boundary between classes.
Decision Boundary = Discriminant Function
[Figures: a linear boundary and a nonlinear boundary, each plotted on lightness and length axes.]
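The distance-based ideas above (a sample-mean prototype, nearest neighbours, and the Euclidean and Manhattan distances) can be sketched in a few lines of Python. This is only an illustrative sketch, not the lecturer's implementation: the function names, the dictionary layout of the training set, and the 2-D data points are all assumptions made up for this example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def manhattan(a, b):
    """Manhattan (city-block) distance between two equal-length vectors."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def sample_mean(samples):
    """Prototype (1): component-wise mean of one class's training samples."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def med_classify(y, training, dist=euclidean):
    """Assign y to the class whose sample-mean prototype is closest."""
    prototypes = {label: sample_mean(xs) for label, xs in training.items()}
    return min(prototypes, key=lambda label: dist(y, prototypes[label]))


def knn_classify(y, training, k=3, dist=euclidean):
    """Assign y to the majority class among its k nearest training samples.

    With k=1 this reduces to the nearest-neighbour rule.
    """
    neighbours = sorted(
        (dist(y, x), label) for label, xs in training.items() for x in xs
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D training data (made-up numbers, not from the slides).
training = {
    "A": [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)],
    "B": [(6.0, 6.0), (6.5, 7.0), (7.0, 6.5)],
}

print(med_classify((2.0, 2.0), training))        # A
print(knn_classify((6.0, 5.5), training, k=3))   # B
```

Passing `dist=manhattan` to either classifier swaps the metric without changing the rest of the code, which mirrors the point that the chosen distance determines how "near" is defined.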
Example
Any fish is a vector in the 2-dimensional space of width and lightness.
[Figure: a fish shown as a point (x1, x2); axes labelled lightness and length.]

Summary
- Classification by the distance from an unknown sample to class prototypes.
- Choosing the prototype:
  - Sample Mean
  - Most Typical Sample
  - Nearest Neighbour
  - K-Nearest Neighbour
- Decision Boundary = Discriminant Function

Module 2 Probabilistic Classification

Review and Extend

Maximum A Posteriori (MAP) Classifier
- Ideally, we want to favour the class with the highest probability for the given pattern: [equation not recovered]
Here P(Ci|x) is the a posteriori probability of class Ci given x.

Bayesian Classification
- Bayes' Theorem: P(Ci|x) = P(x|Ci) P(Ci) / P(x)
Here P(x|Ci) is the class-conditional probability density function (p.d.f.), which needs to be estimated from the available samples or otherwise assumed.
P(Ci) is the a priori probability of class Ci.

MAP Classifier
- The Bayesian classifier is also known as the MAP classifier: [equation not recovered]
So, assign the pattern x to the class with the maximum weighted p.d.f.

Accuracy vs. Risk
However, in the real world, life is not just about accuracy. In some cases, a small misclassification may result in a big disaster, for example in medical diagnosis or fraud detection.
The MAP classifier is biased towards the most likely class: maximum likelihood classification.

Loss Function
On the other hand, in the case of P(C1) ≫ P(C2), the lowest error rate can be attained by always classifying as C1. This is also known as the problem of imbalanced training data.
A solution is to assign a loss to misclassification, which leads to [equation not recovered]

Conditional Risk
Instead of using the posterior probability P(Ci|x), we use the conditional risk: [equation not recovered]
Here the loss term is the cost of action i given class j.
To minimize the overall risk, choose the action with the lowest risk for the pattern: [equation not recovered]

Example
Assume that the amount of fraudulent activity is about 1% of the total credit card activity:
C1 = Fraud, P(C1) = 0.01
C2 = No fraud, P(C2) = 0.99
If losses are equal for misclassification, then: [equation not recovered]

Example
However, losses are probably not the same. Classifying a fraudulent transaction as legitimate leads to direct dollar losses as well as intangible losses (e.g. reputation, hassles for consumers).
Classifying a legitimate transaction as fraudulent inconveniences consumers, as their purchases are denied. This could lead to loss of future business.
Let's assume that the ratio of loss for not fraud to fraud is 1 to 50, i.e., A
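The difference between the MAP rule and the minimum-risk rule in the fraud example can be checked numerically. The sketch below is illustrative only. The priors (0.01 and 0.99) come from the example; the class-conditional likelihoods for one observed transaction are hypothetical values chosen for this illustration; and the loss matrix reads the 1-to-50 ratio as "flagging a legitimate transaction costs 1, accepting a fraudulent one costs 50", which is an assumed interpretation. All function names are made up.

```python
def posteriors(likelihoods, priors):
    """Bayes' theorem: P(Ci|x) = P(x|Ci) * P(Ci) / P(x)."""
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    evidence = sum(joint)  # P(x), the normalising constant
    return [j / evidence for j in joint]


def map_decision(post):
    """MAP rule: index of the class with the highest posterior."""
    return max(range(len(post)), key=lambda i: post[i])


def conditional_risks(post, loss):
    """R(action i | x) = sum over j of loss[i][j] * P(Cj|x).

    loss[i][j] is the cost of taking action i when the true class is j.
    """
    return [sum(l_ij * p_j for l_ij, p_j in zip(row, post)) for row in loss]


def min_risk_decision(post, loss):
    """Minimum-risk rule: index of the action with the lowest conditional risk."""
    risks = conditional_risks(post, loss)
    return min(range(len(risks)), key=lambda i: risks[i])


# Index 0 = fraud (C1), index 1 = no fraud (C2).
priors = [0.01, 0.99]        # from the example
likelihoods = [0.60, 0.05]   # hypothetical P(x|C1), P(x|C2) for one transaction

# Assumed reading of the 1:50 loss ratio.
# Row = action (0: flag as fraud, 1: accept as legitimate), column = true class.
loss = [
    [0, 1],    # flagging: free if fraud, costs 1 if legitimate
    [50, 0],   # accepting: costs 50 if fraud, free if legitimate
]

post = posteriors(likelihoods, priors)
print(map_decision(post))             # 1: MAP says "no fraud"
print(min_risk_decision(post, loss))  # 0: minimum risk says "flag as fraud"
```

With equal losses (`[[0, 1], [1, 0]]`) the two rules agree, which is why MAP alone is adequate only when every misclassification costs the same.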