
Studying the Differences in Each Person's Single Nucleotide Polymorphisms (SNPs): Course Slides (.ppt)

  • Uploader (seller): ziliao2023
  • Document ID: 5895251
  • Upload date: 2023-05-14
  • Format: PPT
  • Pages: 17
  • Size: 75 KB

Application of Support Vector Machine to detect an association between a disease or trait and multiple SNP variations

Authors: Gene Kim, MyungHo Kim
Advisor: Dr. Hsu
Graduate: Ching-Wen Hong

Outline
1. Motivation
2. Objective
3. What's SNP (single nucleotide polymorphism)
4. How to find SNP variations
5. A review of Support Vector Machine
6. A representation of multiple SNP variations as a vector
7. The remarks
8. Inseparable Case
9. Test results with clinical data
10. Personal opinion

Motivation
Studying the differences in each person's single nucleotide polymorphisms (SNPs) can help identify disease-causing genes and even predict whether a drug will be effective for a given individual, which in turn makes it possible to design drugs tailored to the patient. This has a major impact on the development of new drugs. SNP research is a main trend in the biotechnology industry in the post-genomic era.

Objective
We present a method of detecting whether there is an association between multiple SNP variations and a trait or disease. The method exploits the Support Vector Machine (SVM), which has been attracting a lot of attention recently.

What's SNP
What is a SNP (single nucleotide polymorphism)? Although chromosomes differ very little among individuals of the same species, on average about one in every 1,000 base pairs carries a variation. These variations are called SNPs, and they account for differences between individuals in drug sensitivity, blood type, height, and so on. SNPs are also associated with diseases such as cancer, cardiovascular disease, and autoimmune disorders. Domestically, 賽亞基因 is currently working with National Taiwan University Hospital (台大醫院) on hepatitis C SNP research, trying to identify patients' SNPs in order to predict whether a drug will be effective for the patient.

What's SNP
Genetic markers are positions M1, M2, ... in the DNA. The different variants of DNA that different people have at a marker are called alleles, denoted by 1, 2, 3, ... The number of alleles per marker is small: typically fewer than ten (for so-called microsatellite markers) or exactly two (for so-called SNPs).

How to find SNP variations
The problem of determining whether a set of SNP variations causes a specific disease or trait can be formulated as follows. For a given disease or trait:
1. For each set of SNP variations, find its representation as a vector in a Euclidean space (haplotype data, clinical data, ...; we will discuss this on page 9).
2. Get a systematic way of distinguishing the SNP genotypes of normal people from those of people with the disease or trait. We will use the Support Vector Machine (SVM) to separate SNP vectors into two groups (normal, sick).

A review of Support Vector Machine
What is an SVM? A family of learning algorithms for classifying objects into two classes.
Input: a training set $(x_1, y_1), \dots, (x_l, y_l)$ of objects $x_i \in \mathbb{R}^n$ (n-dimensional vector space) and their known classes $y_i \in \{-1, +1\}$.
Output: a classifier $f: \mathbb{R}^n \to \{-1, +1\}$, which predicts the class $f(x)$ for any (new) object $x \in \mathbb{R}^n$.

A review of Support Vector Machine
(1) Linear SVM for separable training sets: a training set $S = \{(x_1, y_1), \dots, (x_l, y_l)\}$, $x_i \in \mathbb{R}^n$, $y_i \in \{-1, +1\}$.

A review of Support Vector Machine
The optimal hyperplane is defined by the pair $(w, b)$. Solve the optimization problem
$\min_{w, b} \ \tfrac{1}{2}\|w\|^2$ subject to $y_i (x_i \cdot w + b) - 1 \ge 0$, $i = 1, \dots, l$.
This is a classical quadratic (convex) program.

A review of Support Vector Machine
(2) Linear SVM for non-separable training sets. Solve the optimization problem
$\min_{w, b, \xi} \ \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} \xi_i$, where $C$ is an extremely large value,
subject to $y_i (x_i \cdot w + b) - 1 + \xi_i \ge 0$, $\xi_i \ge 0$, $i = 1, \dots, l$.
In the dual problem the Lagrange multipliers satisfy $0 \le \alpha_i \le C$, $i = 1, \dots, l$.

A representation of multiple SNP variations as a vector
Scheme: given a disease or trait and a collection of SNP data that depends on genotype in a consistent way (haplotype, clinical data), proceed in 7 steps:
1. Assume that there is no environmental factor.
2. SNP locations are assumed to be known for the disease or trait.
3. Assume there is reference SNP data (good health records).
4. By giving scores based on the difference from the reference data, assign a vector to each SNP data set. The dimension of the vector is the number of SNPs related to the disease or trait.
5. A training set is chosen for the disease or trait, in other words, SNP genotype data of the normal and sick populations.
6. Using Step 4, compute the SNP vectors of the training data set $(x_i, y_i)$, where $x_i$ is a SNP data vector and $y_i = 1$ (sick) or $-1$ (normal).
7. Use the SVM to get a hyperplane dividing the vectors into two groups (sick, normal).

The remarks
1. The reference data can be built by collecting SNP genotypes from the healthy normal population.
2. The hyperplane obtained can be considered as a criterion and, given a new data set, it can be used for testing whether the person the data belongs to is susceptible to the disease or trait.
3. Representation of an object as a vector might be critical for making use of the SVM. How to encode domain knowledge in vector representations is one of the major issues.
4. The idea of difference scoring could be applied to other data sets (visual data such as X-ray or MRI images, ...), in particular to haplotype data, to find a linkage between SNPs and the disease or trait.
5. Once a group of SNP patterns is identified, one can compute the contribution score of each of those SNPs to the disease or trait.

Inseparable Case
For the inseparable case, the iterated use of SVM enables us to divide a collection of labelled vectors into several clustering groups.
1. Set a threshold value, say 80%.
2. Use SVM to separate a collection of labelled vectors into two groups A, B.
3. Check whether each group contains more than 80% of either +1 or -1 labelled vectors. Suppose A is not such a group. Then apply SVM to A again to split it into two subgroups.
4. Repeat this procedure until each subgroup has a majority of more than 80%.
5. For each subgroup, figure out a range.

Test results with clinical data
The clinical data is a cardio-patient records data set: height, age, sex, weight, ethnic background, medical history, birth place, blood pressure (systolic and diastolic), lipid measurements, etc. are converted to numerical values. The label is +1 for a patient with heart attack, stroke or heart failure, and -1 otherwise. We used Thorsten Joachims' implementation of SVM.

Personal opinion
Application of SVM is effective, but it is difficult to solve nonlinear problems. How to encode domain knowledge in vector representations is one of the major issues.
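Step 4 of the vector-representation scheme (difference scoring against reference data) can be illustrated with a small Python sketch. The slides do not say how the score is computed, so everything specific here is an assumption: genotypes are written as two-letter strings, the score for a SNP is the number of alleles that differ from the reference genotype (0, 1 or 2), and the names `allele_difference`, `snp_vector` and the SNP identifiers are hypothetical.

```python
from collections import Counter


def allele_difference(genotype: str, reference: str) -> int:
    """Count the alleles in `genotype` that have no match in `reference`.

    Assumed scoring rule (not from the slides): order of alleles is ignored,
    so "GT" and "TG" are treated as the same genotype.
    """
    remaining = Counter(reference)
    diff = 0
    for allele in genotype:
        if remaining[allele] > 0:
            remaining[allele] -= 1  # this allele is matched by the reference
        else:
            diff += 1               # this allele differs from the reference
    return diff


def snp_vector(sample: dict, reference: dict, snp_ids: list) -> list:
    """One coordinate per SNP, so the dimension equals the number of SNPs."""
    return [float(allele_difference(sample[s], reference[s])) for s in snp_ids]


# Hypothetical SNP identifiers and genotypes, for illustration only.
snp_ids = ["snp1", "snp2", "snp3"]
reference = {"snp1": "AA", "snp2": "CC", "snp3": "GT"}
sample = {"snp1": "AG", "snp2": "TT", "snp3": "TG"}

vector = snp_vector(sample, reference, snp_ids)
print(vector)  # [1.0, 2.0, 0.0]
```

Other scoring rules (for example, weighting some SNPs more heavily) would fit the same interface, and this is where domain knowledge would enter the representation.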
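The soft-margin problem from the SVM review, minimizing $\tfrac{1}{2}\|w\|^2 + C \sum_i \xi_i$, is equivalent to minimizing $\tfrac{1}{2}\|w\|^2 + C \sum_i \max(0,\, 1 - y_i (x_i \cdot w + b))$. The sketch below solves that form with plain subgradient descent in pure Python. This is a swapped-in technique chosen to keep the example self-contained; the slides used Thorsten Joachims' SVM implementation, not this code. The function names, the toy data, and the values of `C`, `lr` and `epochs` are assumptions (in particular, `C=1.0` is used here for a small separable toy set, whereas the slides suggest a very large `C`).

```python
def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=2000):
    """Subgradient descent on 0.5*||w||^2 + C * sum of hinge losses."""
    n_features = len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the 0.5*||w||^2 term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # A point contributes only while it violates the margin.
            if yi * decision_value(w, b, xi) < 1.0:
                for j in range(n_features):
                    grad_w[j] -= C * yi * xi[j]
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    return 1 if decision_value(w, b, x) >= 0 else -1


# Toy vectors standing in for SNP vectors: -1 = normal, +1 = sick.
X_train = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [2.0, 1.0], [1.0, 2.0]]
y_train = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(X_train, y_train)
train_predictions = [predict(w, b, x) for x in X_train]
print(w, b, train_predictions)
```

For this toy set the exact maximum-margin solution is $w = (1, 1)$, $b = -2$; the subgradient iterates only approach it approximately, which is enough for the sign of the decision value to be correct on the training points.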
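The five-step procedure in the Inseparable Case slide can be sketched as a recursive split. This is one possible reading, not the authors' implementation: the linear SVM is again trained by a simple subgradient method, "figure out a range" is interpreted as the per-coordinate minimum and maximum of each subgroup, and the `max_depth` guard and the stop-when-nothing-splits rule are added assumptions so the recursion always terminates. All names and the one-dimensional demo data are hypothetical.

```python
def decision_value(w, b, x):
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=2000):
    """Subgradient descent on 0.5*||w||^2 + C * sum of hinge losses."""
    n = len(X[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = list(w), 0.0
        for xi, yi in zip(X, y):
            if yi * decision_value(w, b, xi) < 1.0:
                for j in range(n):
                    grad_w[j] -= C * yi * xi[j]
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def purity(labels):
    """Fraction of the majority label in a non-empty group."""
    positives = sum(1 for v in labels if v == 1)
    return max(positives, len(labels) - positives) / len(labels)


def iterated_split(X, y, threshold=0.8, max_depth=4):
    """Steps 2-4: split with an SVM, then re-split groups below the threshold."""
    w, b = train_linear_svm(X, y)
    sides = [decision_value(w, b, x) >= 0 for x in X]
    groups = []
    for side in (True, False):
        gx = [x for x, s in zip(X, sides) if s == side]
        gy = [label for label, s in zip(y, sides) if s == side]
        if not gx:
            continue
        # Stop when pure enough, out of depth, or the SVM separated nothing.
        stop = purity(gy) > threshold or max_depth <= 1 or len(gx) == len(X)
        if stop:
            groups.append((gx, gy))
        else:
            groups.extend(iterated_split(gx, gy, threshold, max_depth - 1))
    return groups


def feature_ranges(points):
    """Step 5 (assumed meaning): per-coordinate (min, max) of a subgroup."""
    return [(min(col), max(col)) for col in zip(*points)]


# Hypothetical 1-D data that no single threshold separates.
X_demo = [[0.0], [1.0], [2.0], [3.0], [5.0], [6.0], [7.0], [9.0]]
y_demo = [1, 1, 1, 1, -1, -1, -1, 1]

groups = iterated_split(X_demo, y_demo, threshold=0.8)
for gx, gy in groups:
    print(len(gx), round(purity(gy), 2), feature_ranges(gx))
```

The recursion can end with a subgroup that is still below the threshold when the depth limit is reached or a split makes no progress, so the printed purities should be checked rather than assumed.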
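The clinical-data slide says the patient records are converted to numerical values and labelled +1 or -1. A minimal sketch of such an encoding follows. The field names, units, the 0/1 coding of sex, and the example record are all assumptions made for illustration; the slides do not describe the actual encoding, and fields such as ethnic background or birth place would need their own (unspecified) coding.

```python
# Conditions that lead to the label +1, as listed on the slide.
CARDIAC_EVENTS = ("heart attack", "stroke", "heart failure")


def encode_record(record: dict):
    """Turn one patient record into (feature vector, label).

    Assumed layout: height, age, sex, weight, systolic and diastolic pressure.
    """
    sex_code = {"female": 0.0, "male": 1.0}[record["sex"]]
    features = [
        float(record["height_cm"]),
        float(record["age"]),
        sex_code,
        float(record["weight_kg"]),
        float(record["systolic"]),
        float(record["diastolic"]),
    ]
    has_event = any(event in record["history"] for event in CARDIAC_EVENTS)
    label = 1 if has_event else -1
    return features, label


# Hypothetical record, not taken from the data set used in the slides.
example = {
    "height_cm": 170, "age": 63, "sex": "male", "weight_kg": 82,
    "systolic": 150, "diastolic": 95, "history": ["stroke"],
}

features, label = encode_record(example)
print(features, label)  # [170.0, 63.0, 1.0, 82.0, 150.0, 95.0] 1
```

In practice the features would usually be rescaled to comparable ranges before training, since a linear SVM is sensitive to the scale of each coordinate.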
