
MOE-QSAR课件.ppt (MOE-QSAR courseware)

  • Uploader (seller): 三亚风情
  • Document ID: 3573282
  • Upload date: 2022-09-19
  • Format: PPT
  • Pages: 88
  • Size: 1.94MB

    Keywords:
    MOE_QSAR, courseware
    Resource description:

Advanced Application in MOE: QSAR
InforSense & CCG, All Rights Reserved

Outline
  • QSAR Overview
  • Descriptor calculation
  • Descriptor selection (PCA)
  • Deriving QSAR models
  • Model Validation

QSAR
Quantitative Structure-Activity Relationship (QSAR) applications correlate experimental data (e.g. biological activity or physical properties) with the structure of chemical compounds in a quantitative manner. QSAR models allow the interpretation and prediction of properties of structurally related compounds.
The art of deriving a QSAR model lies in:
  • Identifying a suitable mathematical functional form.
  • Reducing the complex dimensionality of reality into as few dimensions as possible while still being able to give useful predictions of specific properties for molecules not experimentally tested so far.
Most QSAR models are based on linear correlations.
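As a hedged illustration of what a linear functional form means here, a generic multi-descriptor linear model can be written as below. The symbols are generic notation chosen for this sketch and are not taken from the slides: $\hat{y}$ is the predicted activity or property, $x_j$ are the $m$ descriptor values of a molecule, and $b_0, b_j$ are coefficients fitted to the training data.

```latex
\hat{y} = b_0 + \sum_{j=1}^{m} b_j \, x_j
```

Reducing dimensionality then amounts to keeping $m$ small while the model still predicts untested molecules usefully.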

QSAR Model Development
Robust QSAR model development generally proceeds as follows:
  • Assemble a database of experimental results and molecular structures.
  • Identify a descriptor set that correlates highly with the property in question. Use descriptors which are mutually orthogonal and as meaningful and intuitive as possible (based on the underlying physico-chemical properties).
  • Split the dataset into an appropriate training set and test set. The training set is used to develop the model. The test set is used to validate the predictive power of the model. In most cases the applicability of the model will be closely limited to the property space of the data used to build it (the training set).
  • Apply methods (regression, classification, etc.) to generate the predictive models based on the training set.
  • Predict activities for the test set to assess the robustness of the model.
Workflow steps and the corresponding MOE tools:
  • Descriptor calculation (QuaSAR-Descriptor)
  • Descriptor selection (Principal Components, QuaSAR-Contingency)
  • Model development (QuaSAR-Model, Model-Composer)
  • Model validation (Model-Evaluate)

Quantitative & Qualitative QSAR Models in MOE
Besides the selection of the most appropriate descriptors and a meaningful separation of available data into training and test sets, the choice of an appropriate functional form is key to successful QSAR modeling. MOE provides quantitative as well as qualitative QSAR approaches:
  • Quantitative approaches include linear regression methods such as Partial Least Squares (PLS) and Principal Component Regression (PCR).
  • Qualitative approaches include a non-linear binary filter based on Bayesian statistics as well as a binary classification tree.
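The train/test workflow described above can be sketched outside MOE as follows. This is a minimal illustration under stated assumptions: it uses NumPy with ordinary least squares rather than MOE's PLS or PCR, the data are synthetic, and the helper names (`split_train_test`, `fit_linear`, `predict`, `r_squared`) are hypothetical, not MOE functions.

```python
import numpy as np

def split_train_test(n_samples, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) from a random permutation of row indices."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bm]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply the fitted linear model to a descriptor matrix."""
    return coef[0] + X @ coef[1:]

def r_squared(y_true, y_pred):
    """Coefficient of determination of predictions against observed values."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic "descriptors" and "activity" (assumption: made-up data, not a real dataset).
rng = np.random.default_rng(42)
X = rng.normal(size=(40, 3))
y = 0.5 + X @ np.array([1.0, -2.0, 0.3]) + rng.normal(scale=0.1, size=40)

train, test = split_train_test(len(y))
coef = fit_linear(X[train], y[train])                  # develop the model on the training set
r2_test = r_squared(y[test], predict(coef, X[test]))   # validate on the held-out test set
print("test-set R^2:", round(r2_test, 3))
```

Only the test-set statistic says anything about predictive power; a fit statistic computed on the training rows would be optimistic.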

Descriptor Calculation

Initial Steps in Understanding a Dataset
The initial steps in interpreting an experimental dataset involve:
  • Building preliminary Structure Activity Relationships: common fragments to actives/inactives.
  • Looking for patterns within the data structure: are clusters present in the data?
  • Evaluating the relative importance of descriptors for a potential model: involves both stochastic and heuristic evaluation.
  • Finding commonality and diversity within the data: robustness in chemical space.
Step 1 in data analysis: find the relevant set of descriptors.

Molecular Descriptors and Fingerprints
Molecular descriptors encode molecular properties per molecule into single numerical values.
  • Qualitative: yes/no flags for the presence or absence of certain features (like bits in fingerprints, see below).
  • Quantitative: numerical measures of physico-chemical or structural properties.
  • May depend on connectivity and chemistry only (2D) or also on conformation/3D geometry (3D).
Fingerprints typically consist of bit strings of several hundreds or even thousands of individual yes/no flags. Each position of the bit string encodes the presence (1) or absence (0) of a distinct property or feature, including substructure fragments, connectivity patterns or pharmacophore type functional properties.
[Figure: example molecule (molecular weight: 385.282; logP: 2.552; rotatable single bonds: 5) with its fingerprint shown either as a bit string (010010100101001...) or as a list of set bit positions (2, 5, 7, 10, 12, 15, ...)]

QuaSAR Descriptor Panel
Numerical molecular descriptors may be calculated either via (MOE|Compute|QuaSAR|QuaSAR-Descriptor) without opening a database or via (DBV|Compute|Descriptors).
[Figure: QuaSAR-Descriptor panel with the labels Input database, Descriptor synchronization with database, Descriptor list, Display filters]

Overview of MOE Descriptors
  • 300 2D and 3D descriptors: topological indices, surface area properties, physical properties, energy terms.
  • Add new descriptors with SVL: automatically added to relevant calculations; existing descriptors can be used as templates.
  • Proprietary VSA descriptors: subdivision of surface area based on LogP, MR (molar refractivity) and partial charge; 2D based approximation (for speed on large datasets).
  • Semi-empirical descriptors: descriptor names are prefixed with the Hamiltonian (AM1_, PM3_, MNDO_); total energy, electronic energy, heat of formation, HOMO, LUMO, ionization potential.

Binned VSA Descriptors I
A subset of highly uncorrelated, intuitive and meaningful 2D descriptors has been implemented in MOE to provide a stable "default" approach for new datasets: the binned van der Waals surface area descriptors (referred to as binned VSA descriptors in MOE).
LogP (partition coefficient), MR (molar refractivity) and partial charge are used to cover a meaningful property space from hydrophobic to hydrophilic interactions. Each of these descriptor sets is derived from, or related to, the Hansch and Leo descriptors.
The descriptor returns the approximate surface area of a molecule, produced from a 2D representation, that falls into a given range of property values.
Using the subset of binned VSA descriptors may help to avoid the need for automatic descriptor selection routines.

Binned VSA Descriptors II
The surface contribution which may be sensed by neighboring molecules is approximated by subtracting overlapping surface areas from first shell atom neighbors.
The 1999 Wildman & Crippen atom type model is used to map properties onto individual atoms. Contributions to LogP and MR are derived in linear models from datasets of about 10,000 experimental data points each.
For partial charge calculation, the Gasteiger PEOE charges are used.
The approximate surface area contributions of a given molecule are added for each property bin.
[Figure: per-atom surface contributions Vi sorted by property value Pi into bins. Vi values: V7, V2, V1, V6, V3, V4+V8+V5; Pi ranges: [0,1), [1,2), [2,3), [3,4), [4,5), [5,6), 6 and above; descriptors: D1 to D6; atoms: C8, C3, C4, C5, C6, N7, O2, C1]
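The binning step described above (add up the approximate surface area of every atom whose property value falls into a given range) can be sketched as follows. This is an illustration only: the per-atom areas, property values and bin edges are made-up numbers, the function name `binned_vsa` is hypothetical, and no attempt is made to reproduce MOE's atom typing or surface approximation.

```python
import numpy as np

def binned_vsa(atom_areas, atom_props, bin_edges):
    """Sum per-atom surface areas into half-open property bins [e_k, e_{k+1}).

    Atoms whose property value lies outside the outermost edges are ignored.
    """
    areas = np.asarray(atom_areas, dtype=float)
    props = np.asarray(atom_props, dtype=float)
    edges = np.asarray(bin_edges, dtype=float)
    sums = np.zeros(len(edges) - 1)
    # np.digitize gives i with edges[i-1] <= p < edges[i]; shift to a 0-based bin index.
    idx = np.digitize(props, edges) - 1
    for k, area in zip(idx, areas):
        if 0 <= k < len(sums):
            sums[k] += area
    return sums

# Made-up per-atom surface areas and property values (assumption, for illustration).
areas = [10.0, 5.0, 8.0, 12.0, 3.0]
props = [0.2, 1.5, 1.9, 3.0, 0.9]
edges = [0, 1, 2, 3, 4]

descriptors = binned_vsa(areas, props, edges)
print(descriptors)
```

Because every atom lands in at most one bin, the descriptors of one property set add up to the total approximate surface area whenever the bins cover the full property range.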

2D BCUT and GCUT Descriptors
BCUT: Burden matrix eigenvalues
  • The BCUT descriptors are calculated from the eigenvalues of a modified adjacency matrix. The adjacency matrix contains a 1 if atoms i, j are bonded; 0 otherwise.
  • Each ij entry of the adjacency matrix takes the value $b_{ij}^{-1/2}$, where $b_{ij}$ is the formal bond order between bonded atoms i and j.
  • The diagonal takes the value of the associated PEOE, SMR or logP descriptor.
  • The resulting eigenvalues are sorted and the smallest, 1/3 percentile, 2/3 percentile and largest eigenvalues are reported.
GCUT: Inverse graph distance matrix eigenvalues
  • The GCUT descriptors are calculated from the eigenvalues of a modified graph distance adjacency matrix, similar to BCUT descriptors.
  • Each ij entry of the adjacency matrix takes the value $d_{ij}^{-2}$, where $d_{ij}$ is the (modified) graph distance between atoms i and j.
  • The diagonal takes the value of the associated PEOE partial charges, SMR or logP descriptors.
  • The resulting eigenvalues are sorted and the smallest, 1/3 percentile, 2/3 percentile and largest eigenvalues are reported.

Caveats in Descriptor Calculation
To ensure consistent i3D and x3D descriptor values when starting from 2D structures without hydrogens, one of the following procedures should be used:
  • Via the DBV:
    1. Import the structures without adding hydrogens.
    2. Energy minimize the database with the following options enabled: "Rebuild 3D", "Add Hydrogens", "Calculate forcefield partial charges".
  • On the command line via sdproc, which adds hydrogens, calculates partial charges, performs energy minimization and calculates descriptors in a single pass.
Note: Differences may arise when:
  • SMILES structures are used as the molecular source (random initial coordinates).
  • Hydrogen addition, partial charge calculation and energy minimization are performed in series (coordinate truncation errors).

Exercise: Descriptor Calculation
Descriptor selection depends on the experience of the user. TPSA is used to account for molecular size and electrostatic interaction, SlogP for permeability, and SMR for polarizability. The correlation between the three descriptors is plotted.
  1. Open the merged_bb.mdb file, and save a local copy to the working directory.
  2. Open the QuaSAR-Descriptor panel (DBV|Compute|Descriptors). A list of the built-in descriptors is displayed, which can be navigated using text filters.
  3. Enter TPSA in the Descriptor Filter.
  4. Left mouse click once to select the TPSA descriptor in the descriptor list.
  5. Enter SMR in the Descriptor Filter and select the SMR descriptor from the filtered list.
  6. Enter SlogP in the Descriptor Filter and select SlogP from the filtered list.
  7. Press OK to calculate the three selected descriptors.
Check descriptor correlations:
  8. Select the activity field (logBB) and the three descriptor fields in the database (SlogP, SMR, TPSA).

Descriptor Calculation: Correlation
The relationship between two variables X and Y is described by the correlation coefficient R. This is determined by linear regression analysis (see QSAR models), where the linear equation with the smallest deviations from all data points is derived. The correlation coefficient is calculated by:

$$R = \frac{\operatorname{cov}(X,Y)}{\sqrt{\operatorname{var}(X)\,\operatorname{var}(Y)}} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2\,\sum_{i=1}^{n}(y_i-\bar{y})^2}}$$

A correlation coefficient of 1 indicates a perfect correlation, -1 a perfect inverse correlation and 0 no correlation.
[Figure: example scatter plots with R = 1.00, R = -0.72, R = -0.06, R = 0.95 and R = 0.77]

Correlation Between Stork Populations and Human Birthrates
(H. Sies, Nature, 332 (1988) 495)
Any correlation between descriptors and experimental data has to be meaningful mechanistically.
[Figure: number of storks and babies by year, 1965 to 1981]

Exercise: Descriptor Calculation - Correlation Matrix
Models will be more robust if uncorrelated descriptors are used. Correlation can be inspected using either a correlation plot or a matrix.
  1. Select (DBV|Compute|Analysis|Correlation Matrix). The numbers in the icons in the correlation matrix correspond to percent correlation.
  2. Double-click on the highlighted cell to bring up the correlation plot (or select (DBV|Compute|Analysis|Correlation Plot) and choose two numeric fields).

Exercise: Descriptor Calculation - Correlation Plot
A squared correlation coefficient (R²) of 0.0756 and the linear regression equation are indicated in the header line of the correlation plot. There is virtually no correlation between SlogP and TPSA.
  3. Select e.g. active compounds (logBB above 0.5) in the DBV or any data points in the plot (left mouse drag over the selection). The selection is interactive between the plot and the database viewer. To deselect entries, use the (DBV|Entry|Clear Entry Selection) menu, the Entry Popup menu or the Clear Selection button in the DBV plot.
Display attributes may be modified and data exported to other tools.
  4. Clear the selection using the Clear Selection button in the plot.
  5. Select Data to Clipboard to copy the XY values, e.g. into a text editor, or to import the data into Excel.
  6. Select Attributes to change to a white background, black foreground, black markers, etc.
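For data copied out of MOE (for example via Data to Clipboard), the same correlation check can be reproduced with a few lines of NumPy. The sketch below implements the Pearson formula for R directly and builds a small correlation matrix; the descriptor values are made up for illustration and are not taken from merged_bb.mdb, and the function names are hypothetical.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

def correlation_matrix(columns):
    """Pairwise Pearson R for a dict of equally long numeric columns."""
    names = list(columns)
    mat = np.eye(len(names))
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson_r(columns[names[i]], columns[names[j]])
            mat[i, j] = mat[j, i] = r
    return names, mat

# Made-up values (assumption: illustrative only, not real descriptor data).
data = {
    "SlogP": [1.2, 2.5, 0.3, 3.1, 1.9, 2.2],
    "SMR":   [6.1, 7.4, 5.0, 9.2, 6.8, 8.1],
    "TPSA":  [60.0, 35.5, 90.2, 20.1, 75.3, 41.0],
    "logBB": [-0.2, 0.6, -1.1, 0.9, -0.4, 0.3],
}

names, mat = correlation_matrix(data)
for name, row in zip(names, mat):
    # Print as percent correlation, similar to the matrix view described above.
    print(f"{name:>6}", " ".join(f"{100 * r:6.0f}" for r in row))
```

Squaring an off-diagonal entry gives the R² value of the corresponding two-variable correlation plot.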

Descriptor Selection
In the preceding example one of the three descriptors (SMR) shows a low relationship to logBB. In practice, many descriptors (some correlated, some not) are calculated and used as the starting point to build a QSAR model. There are two approaches in the development of robust QSAR models:
  • Descriptor reduction: select calculated descriptors which are not, or only weakly, correlated (orthogonal), either manually or semi-automatically with QuaSAR-Contingency.
  • Dimension reduction: use all calculated (possibly correlated) descriptors in a Principal Component Analysis (PCA).

Descriptor Selection: QuaSAR-Contingency
QuaSAR-Contingency (DBV|Compute|QuaSAR-Contingency) is a statistical application to assist in the selection of descriptors for QSAR or QSPR. The application performs a bivariate contingency analysis for each descriptor and the activity or property value. It produces a table of coefficients that helps to select important descriptors.
[Figure: QuaSAR-Contingency panel with the labels Input database, Predictable property, Descriptor list]

Exercise: QuaSAR-Contingency
Determine the most (un)important descriptors for the merged_bb.mdb dataset.
  1. Open the QuaSAR-Contingency panel (DBV|Compute|QuaSAR-Contingency).
  2. Select the 3 descriptors (SlogP, SMR, and TPSA) and press OK.
  3. Examine the result in the text editor. SlogP is considered the least important descriptor.
[Figure: QuaSAR-Contingency output with the labels Contingency measures, Descriptor dependence, Most important descriptors]
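To make the idea of a bivariate contingency analysis concrete, the sketch below bins one descriptor and the activity into quantile classes, counts co-occurrences and summarizes the table with Cramér's V. This is a generic textbook construction, not MOE's implementation: the binning scheme, the choice of Cramér's V and the function names are assumptions made for illustration.

```python
import numpy as np

def contingency_table(x, y, n_bins=3):
    """Count co-occurrences of quantile-binned x and y (n_bins classes each)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    qs = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]   # interior quantile levels
    xi = np.digitize(x, np.quantile(x, qs))        # class index 0 .. n_bins-1
    yi = np.digitize(y, np.quantile(y, qs))
    table = np.zeros((n_bins, n_bins))
    for a, b in zip(xi, yi):
        table[a, b] += 1
    return table

def cramers_v(table):
    """Cramér's V: 0 for independent classes, 1 for a perfect association."""
    n = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    mask = expected > 0
    chi2 = np.sum((table[mask] - expected[mask]) ** 2 / expected[mask])
    k = min(table.shape) - 1
    return float(np.sqrt(chi2 / (n * k)))

# Made-up data (assumption): one descriptor that tracks the activity, one that does not.
rng = np.random.default_rng(1)
activity = np.arange(30, dtype=float)
related = activity + rng.normal(scale=2.0, size=30)
unrelated = rng.permutation(30).astype(float)

print("related descriptor   V =", round(cramers_v(contingency_table(related, activity)), 2))
print("unrelated descriptor V =", round(cramers_v(contingency_table(unrelated, activity)), 2))
```

Unlike the correlation coefficient, a contingency measure of this kind does not assume a linear relationship, which is one reason to look at both when ranking descriptors.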

Principal Components Analysis (PCA)
PCA reduces the dimensionality of a set of molecular descriptors by linearly transforming the data such that all components remain orthogonal.
  • The 1st PC describes the direction of greatest data variance.
  • The 2nd PC describes the direction of the second greatest data variance, etc.
[Figure: data points on the axes Descriptor 1, Descriptor 2 and Descriptor 3, with the directions PC 1 and PC 2]

PCA Pre-Processing
Since descriptors may be heterogeneous in nature (units, scale, etc.), the data should be pre-processed to build meaningful models. PCA is generally applied to scaled and/or mean centered data.
  • Scaling: usually appropriate in systems where the variables have different units and/or cover different magnitudes, e.g. variation between 100-110 °C and 0.01-0.1 M. Puts all descriptors on an equal basis in the analysis.
  • Mean centering: Translates the origi…
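The effect of scaling and mean centering before PCA can be sketched with NumPy alone. This is a minimal illustration under stated assumptions: the data are synthetic (two correlated columns on very different scales, loosely mimicking the 100-110 °C versus 0.01-0.1 M example, plus one unrelated column), PCA is computed through a singular value decomposition, and the function names are hypothetical rather than MOE calls.

```python
import numpy as np

def autoscale(X):
    """Mean center each column and scale it to unit sample standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pca(X_scaled, n_components=2):
    """PCA via SVD of pre-processed data.

    Returns the scores, the component directions (rows) and the fraction of
    total variance explained by each retained component.
    """
    _, s, Vt = np.linalg.svd(X_scaled, full_matrices=False)
    components = Vt[:n_components]
    scores = X_scaled @ components.T
    explained = s ** 2 / np.sum(s ** 2)
    return scores, components, explained[:n_components]

# Synthetic descriptors (assumption: made-up data for illustration).
rng = np.random.default_rng(0)
t = rng.normal(size=50)
X = np.column_stack([
    105.0 + 3.0 * t + rng.normal(scale=0.5, size=50),     # large-magnitude column
    0.05 + 0.02 * t + rng.normal(scale=0.005, size=50),   # small-magnitude column
    rng.normal(size=50),                                  # unrelated column
])

Z = autoscale(X)
scores, components, explained = pca(Z)
print("explained variance fractions:", np.round(explained, 2))
```

Without the scaling step the first column would dominate the variance purely because of its units, and the first principal component would mostly point along it.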

    About this document
    Title: MOE-QSAR课件.ppt
    Link: https://www.163wenku.com/p-3573282.html
