
Challenges That Large-Mass-Ratio Binary Black Holes Bring to Numerical Relativity (courseware).ppt

  • Uploader (seller): 三亚风情
  • Document ID: 3494985
  • Upload date: 2022-09-07
  • Format: PPT
  • Pages: 42
  • Size: 4.06MB

    Keywords:
    mass ratio, black hole, numerical, relativity, bring, challenge, courseware
    Resource description:

    Port AMSS-NCKU code to GPU
    Zhoujian Cao, Academy of Mathematics and System Science, CAS
    Cowork with Zhihui Du, Steven Brandt, Frank Loeffler and Quan Yang
    2013-8-7, 2013 International School on Numerical Relativity and Gravitational Waves, Pohang, Korea

    Outline
      • Motivations from gravitational wave detection
      • New parallel mesh refinement numerical scheme
      • GPU acceleration for NR
      • Summary

    The most stringent test of GR
      • The anomalous precession of the perihelion of Mercury (1915, v ~ …)
      • Deflection of starlight (1919, v ~ …)
      • Gravitational redshift (1965, v ~ …)
      • Gravitational time delay effect (1968, v ~ …)
      • Evidence of gravitational waves (1978, v ~ …)
      • Frame-dragging effect (2010, v ~ …)
      • Direct gravitational wave detection (?, v ~ 1)
      GR = Newtonian gravity + PN(v) + PN(v^2) + …
      (The powers of ten giving v for each test did not survive in this extract.)

    Gravitational wave astronomy
      • Search back to the extremely early universe
      • Hear the dark universe

    Gravitational wave and its detection

    Category of Black Holes
      • Supermassive black hole: M ~ 10^5 to 10^9 Msun
      • Stellar-mass black hole: M ~ 1 to tens of Msun
      • Intermediate-mass black hole (IMBH): M ~ tens to 10^5 Msun (mainly in globular clusters)
      Farrell et al., Nature 460 (2009) 73; Feng et al., New Astronomy Reviews 55 (2011) 166

    Category of Black Holes Binary
      • IMBH binaries, mass ratio axis from 1:1000 to 1:1
      • ALIA: Xuefei Gong et al., CQG 28, 094012 (2011)
      • Advanced LIGO: Abadie et al., PRD 85, 102004 (2012)

    IMBH and GW detection

    Data analysis and template
      • Ref to Sang Hoon Oh's lecture
      • Template model for BBH? (Yi Pan's talk, 2013)

    Template model for BBH
      • PN templates: for the early stage of the inspiral
      • EOBNR (effective one body model together with numerical relativity): for the full inspiral + merger + ringdown stage; works well for mass ratios less than 1:8 and for extreme mass ratio BBH; high spinning, precession!
      • But there is no reliable template for mass ratios from 1:10 to 1:100

    PN estimation
      • From a given separation of the two BHs, the number of orbits increases quickly as the mass ratio increases.
      • The cost of a numerical simulation with full GR increases accordingly: compared with 1:1, 1:100 needs 10 times more computation.

    Computational cost
      • 1:1: 9 days
      • 1:100: 20 days
      • LSSC cluster II, 128 CPUs, for the last 2 orbits
      • Computational cost 1 to 20!

    Challenge of large mass ratio BBH to NR
      • Compared to 1:1, the computational cost of a 1:100 BBH increases roughly 200 times!
      • A typical simulation of a 1:1 BBH needs 14 days, so the straightforward method applied to 1:100 needs roughly 1 year!

    Possible ways out
      1. Physical level: approximation methods, such as the self-force framework (so far only at first order)
      2. Numerical algorithm level: implicit scheme [R. Lau et al., PRD 84, 084023 (2011)], combining Cauchy evolution with null evolution
      3. Computer level: improve scalability to use more CPUs, use GPU

    Mesh refinement scheme
      • High resolution mesh grids for the region near the BH, low resolution mesh grids for the far region

    Mesh refinement in CFD
      • Result based on PARAMESH
      • PARAMESH, GrACE, JASMIN

    Comparison of NR and CFD
      • NR (only for BH): computationally expensive at a single grid point, but the functions are quite smooth, so few grid points (hundreds) and high order finite differences
      • CFD: computation at a single point is cheap, but the fluid dynamics is quite complex (compare the lectures on HD), so the number of grid points is quite large (millions)

    Mesh refinement scheme: scheme adopted by PARAMESH
      • t-x diagram with Level 0 and Level 1; time steps Δt_1 = Δt_0

    Mesh refinement scheme: scheme for NR
      • t-x diagram with Level 0 and Level 1; time steps Δt_1 = Δt_0 / 2
      • Distribute data along one level to the available processes (LS scheme)
      F. Loeffler et al., CQG 29, 115001 (2012)

    Mesh refinement scheme: parallelization limit
      • 200 x 200 x 200 grid, 6th order finite difference (8 ghost points for two sides)
      • Limit on the number of processes n (the formula is garbled in this extract: "3/15.08nn 32n256")
      • How about distributing data on all levels and calculating them in parallel?

    Parallel mesh level algorithm
      • PX scheme: distribute data on all levels to all processes; calculate in parallel

    Mesh refinement scheme (PX scheme, time runs downward)
      procs for lev0    procs for lev1    procs for lev2
      run               run               run
      wait              wait              run
      wait              run               run
      wait              wait              run
      run               run               run
      • Strong scaling property due to more data to distribute
      • Resource wasting (L x procs of LS) due to waiting!
      • Calculation speed: 2 times faster!

    Parallel mesh level algorithm
      • P2 scheme: distribute data on the finest level to half of the processes, and distribute data on the other levels, along the same level, to the other half; the finest level and the other levels are calculated in parallel, while the other levels are calculated sequentially
      (figure labels: lev0, lev2, lev1)

    Mesh refinement scheme (P2 scheme, time runs downward)
      procs for lower levels    procs for lev2
      lev1                      run
      lev0                      run
      lev1                      run
      wait                      run
      lev1                      run
      • Scaling property is weaker than PX
      • Less waiting (2 x procs of LS)!
      • Calculation speed: 2 times faster!

    Comparison to LS scheme

    More complicated case
      • t-x diagram with lev0, lev1, lev2
      • Now the procs for the finest level have to wait!

    GPU acceleration
      • For system biology: Yamazaki, Igarashi, Neural Networks, 2013
      • For GW data analysis: Zhihui Du et al., CQG 29, 235018 (2012)

    Put RHS calculation to GPU
      • For the AMSS-NCKU code, the RHS calculation takes 80% of the time
      • The RHS function involves too many variables; even just transferring their addresses is time consuming
      • So pack these addresses and store them in constant memory (no further transfers during the evolution), which saves shared memory at the same time
      • Keep the data on the GPU until the MPI data transfer between different processes
      • Use the buffer point method to reduce the MPI transfers for RK4 from 4 times to only 1 time; this also reduces the number of data transfers between GPU and CPU
      • Arrange shared memory: divide the RHS calculation into 8 parts so that the memory requirement of each part can be satisfied by shared memory
      • For one RHS calculation, copy data from global memory to shared memory once and use shared memory most of the time

    Put restrict-prolong to GPU
      • After putting the RHS on the GPU, the most time consuming part is the Restrict-Prolong interpolation
      • How to treat this part? The work is going on

    Test of GPU acceleration on desktop

    OpenMP implementation
      • AMSS-NCKU = Fortran90 + C++
      • C++ is used for program flow control and memory administration
      • Fortran90 is used for the main numerical calculation
      • Add OpenMP commands in the Fortran90 segments

    Structure of AMSS-NCKU GPU code
      • Two groups of MPI processes, one for CPU and one for GPU
      • MPI + OpenMP + CUDA

    Test of AMSS-NCKU GPU code
      • Titan: formerly the top 1 supercomputer in the world (now Tianhe 2)
      • 1024 x 16 cores + 1024 GPUs

    Summary
      • Challenge from GW detection: AdvLIGO 1:150, ALIA 1:1000
      • Parallel mesh level calculation method: 2x speed up
      • GPU implementation to NR: roughly 5x speed up obtained; 30x speed up? In progress
      • 10x in all is ready for science simulation
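
    The run/wait pattern of the PX scheme can be reproduced with a few lines of Python. This is only a sketch, not code from AMSS-NCKU: it assumes a factor-of-2 refinement in time between neighbouring levels, one equally sized process group per level, and that all levels due at a given fine substep are advanced at once. The names px_schedule and busy_fraction are hypothetical.

```python
def px_schedule(num_levels):
    """Return the run/wait table for one coarse time step of a PX-like scheme.

    Row k describes fine substep k; column l says whether the process group
    that owns level l runs or waits. Assumption (not from the slides): level l
    takes 2**l steps per coarse step, i.e. the time step halves on each level.
    """
    substeps = 2 ** (num_levels - 1)  # steps of the finest level per coarse step
    table = []
    for k in range(substeps):
        row = []
        for level in range(num_levels):
            stride = 2 ** (num_levels - 1 - level)  # fine substeps per step of this level
            row.append("run" if k % stride == 0 else "wait")
        table.append(row)
    return table


def busy_fraction(table):
    """Fraction of (process group, substep) slots that are spent running."""
    slots = sum(len(row) for row in table)
    busy = sum(row.count("run") for row in table)
    return busy / slots


if __name__ == "__main__":
    for row in px_schedule(3):
        print("  ".join(f"{cell:4s}" for cell in row))
    print("busy fraction:", busy_fraction(px_schedule(3)))
```

    For three levels the four rows match the first four rows of the PX table (the fifth row of the slide is the start of the next coarse step). Under the equal-group-size assumption only 7 of the 12 slots are busy, which is the waiting the slides describe as resource wasting.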

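    The buffer point idea mentioned for RK4 can be illustrated on a toy problem. The sketch below is an assumption-laden stand-in, not the AMSS-NCKU implementation: it uses a 1D grid, a 2nd-order centred stencil (one neighbour per side) instead of the 6th-order stencil of the slides, and plain Python lists instead of MPI. Each RK4 stage spoils one more point at an unsynchronised edge, so a block padded with 4 stages x 1 neighbour = 4 buffer points per side can take a whole RK4 step without any exchange between stages. All function names are hypothetical.

```python
import math

STAGES = 4      # RK4 evaluates the right-hand side four times per step
HALF_WIDTH = 1  # neighbours per side used by the toy centred stencil


def rhs(u):
    """Centred difference (grid spacing absorbed into dt); edge points are left at 0, i.e. invalid."""
    out = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        out[i] = 0.5 * (u[i + 1] - u[i - 1])
    return out


def rk4_step(u, dt):
    """One classical RK4 step with no data exchange between the four stages."""
    k1 = rhs(u)
    k2 = rhs([a + 0.5 * dt * k for a, k in zip(u, k1)])
    k3 = rhs([a + 0.5 * dt * k for a, k in zip(u, k2)])
    k4 = rhs([a + dt * k for a, k in zip(u, k3)])
    return [a + dt / 6.0 * (p + 2.0 * q + 2.0 * r + s)
            for a, p, q, r, s in zip(u, k1, k2, k3, k4)]


def step_owned_block(u_global, lo, hi, buffer_points, dt):
    """Advance the owned points [lo, hi) using a local copy padded with buffer points."""
    local = u_global[lo - buffer_points: hi + buffer_points]  # the single "transfer"
    advanced = rk4_step(local, dt)
    return advanced[buffer_points: buffer_points + (hi - lo)]


def max_mismatch(buffer_points, lo=20, hi=40, n=64, dt=0.5):
    """Largest difference between the padded-block result and the full-grid result."""
    u = [math.sin(0.3 * i) for i in range(n)]
    reference = rk4_step(u, dt)[lo:hi]
    block = step_owned_block(u, lo, hi, buffer_points, dt)
    return max(abs(a - b) for a, b in zip(reference, block))


if __name__ == "__main__":
    print("4 buffer points:", max_mismatch(STAGES * HALF_WIDTH))
    print("3 buffer points:", max_mismatch(STAGES * HALF_WIDTH - 1))
```

    With 4 buffer points the owned points agree with the full-grid step, while with 3 the outermost owned points are already contaminated. A wider stencil would need proportionally more buffer points per stage.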
    About this document
    Title: Challenges That Large-Mass-Ratio Binary Black Holes Bring to Numerical Relativity (courseware).ppt
    Link: https://www.163wenku.com/p-3494985.html
