{"id":62,"date":"2016-08-14T05:57:54","date_gmt":"2016-08-14T05:57:54","guid":{"rendered":"http:\/\/betelge.wordpress.com\/?p=62"},"modified":"2016-08-14T05:57:54","modified_gmt":"2016-08-14T05:57:54","slug":"emulated-64-bit-floats-in-opengl-es-shader","status":"publish","type":"post","link":"https:\/\/www.betelge.com\/blog\/2016\/08\/14\/emulated-64-bit-floats-in-opengl-es-shader\/","title":{"rendered":"Emulated 64-bit floats in OpenGL ES shader"},"content":{"rendered":"<p>In the <a href=\"https:\/\/betelge.wordpress.com\/2016\/08\/14\/high-precision-floats-in-opengl-es-shaders\/\">previous post<\/a>\u00a0the resolution of a Mandelbrot set fractal was greatly increased by generating it in a vertex shader instead of a fragment shader, because full highp 32-bit floats were available instead of the mediump 16-bit half-floats that many\u00a0mobile devices limit their fragment shaders to. The mantissa precision increased from 10 bits to 23 bits.<\/p>\n<p><a href=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot8.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-63\" src=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot8.png?w=300\" alt=\"2014-9-5_mandelbrot8\" width=\"300\" height=\"180\" srcset=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot8.png 800w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot8-300x180.png 300w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot8-768x461.png 768w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a> <a href=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot9.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-64\" src=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot9.png?w=300\" alt=\"2014-9-5_mandelbrot9\" 
width=\"300\" height=\"180\" srcset=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot9.png 800w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot9-300x180.png 300w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot9-768x461.png 768w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>At a scale of 10^(-4) the difference\u00a0is\u00a0obvious. Earlier I tried emulating a higher precision in the fragment shader inspired by\u00a0this <a title=\"Heavy computing with GLSL \u2013 Part 2: Emulated double precision\" href=\"https:\/\/thasler.com\/blog\/?p=93\" target=\"_blank\">blog post<\/a>, but it just wouldn&#8217;t work with the half-floats. It worked beautifully in the Android Emulator when I ran it on my desktop, but\u00a0my desktop always uses full 32-bit floats even for the mediump fragment shader floats, which we can&#8217;t expect in a mobile device fragment shader.<\/p>\n<p>Now that we are doing the calculation in the vertex shader instead, we have full 32-bit floats on all mobile devices, just like in the Android Emulator, so let&#8217;s try this again.<\/p>\n<h2>Representing a double with two floats<\/h2>\n<p>We&#8217;ll split the double into a high part and a low part and store them in two floats:<\/p>\n<pre>vec2 df;\ndf.x = high;\ndf.y = low;<\/pre>\n<p>Since we&#8217;re using complex numbers, which are already split into real and imaginary components, we&#8217;ll need four floats to represent a double-precision complex number.<\/p>\n<pre>vec4 c;\nc.xy = vec2(high_real, low_real);\nc.zw = vec2(high_imag, low_imag);<\/pre>\n<p>We&#8217;re now using the full 128-bit vectors that GLSL supports to represent our complex values. We can use the primitive type <code>vec4<\/code>.<\/p>\n<p>We need methods to split, add and multiply these emulated doubles. 
See them in this\u00a0<a title=\"Heavy computing with GLSL \u2013 Part 2: Emulated double precision\" href=\"https:\/\/thasler.com\/blog\/?p=93\" target=\"_blank\">blog post<\/a>. They will need to be rewritten and optimized for complex numbers.\u00a0Since GLSL doesn&#8217;t have any operator overloading, the code ends up looking a bit messy.<\/p>\n<pre>attribute vec3 position;\nuniform vec4 offset; \/\/ vec4(high_x, low_x, high_y, low_y)\nuniform vec4 scale;\n...\nvec2 real = add(mul(scale.xy, position.x), offset.xy);\nvec2 imag = add(mul(scale.zw, position.y), offset.zw);\nvec4 c = vec4(real, imag);\nvec4 z = vec4(0.);\n...\nfor(i = 0; z.x*z.x + z.z*z.z &lt; ESCAPE_RADIUS; i++) {\n    z = add(mul(z,z), c);\n}\n...<\/pre>\n<p>We can&#8217;t interpolate the split attributes, but that never happens because there is one vertex per fragment. In fact, I should try turning the interpolation off if possible. In the for-loop condition we only use the high parts of the real and imaginary values of <code>z<\/code>.<\/p>\n<p>The <code>add()<\/code> and <code>mul()<\/code> methods are hiding a large number of multiplications. 
This code is much slower than the previous shaders, but it does greatly increase the precision.<\/p>\n<p><a href=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot10.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-69\" src=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot10.png?w=300\" alt=\"2014-9-5_mandelbrot10\" width=\"300\" height=\"180\" srcset=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot10.png 800w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot10-300x180.png 300w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot10-768x461.png 768w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a> <a href=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot11.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-70\" src=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot11.png?w=300\" alt=\"2014-9-5_mandelbrot11\" width=\"300\" height=\"180\" srcset=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot11.png 800w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot11-300x180.png 300w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot11-768x461.png 768w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>Zooming in until the vertex method looks blocky at the scale of 10^(-7)\u00a0and switching to the emulated doubles method shows the huge improvement. What we&#8217;re looking at is a small corner\u00a0of the black dot that can be seen in the lower part of the images at the top of this post. Now we can zoom in all the way to a scale of 10^(-14) before we start seeing blocks. 
Whatever stopped the emulation from working properly in the fragment shader isn&#8217;t a problem in the vertex shader. This is\u00a0working great!<\/p>\n<h2>Results<\/h2>\n<p><a href=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot13.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-medium wp-image-71 aligncenter\" src=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot13.png?w=300\" alt=\"2014-9-5_mandelbrot13\" width=\"300\" height=\"180\" srcset=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot13.png 800w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot13-300x180.png 300w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot13-768x461.png 768w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a>This is a rendering at the scale of 10^(-13). Occasionally there are some linear discontinuities. I don&#8217;t know where these are coming from. Maybe from an\u00a0incorrect splitting of some doubles. 
That&#8217;s something to look into later.<\/p>\n<p><a href=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/cropped-lcdqtf6faykehdrbur5kd9lxhu3dxwha-etjcm4awbtlehnpvdzhxtlefz45tsu3eqh900-rw-e1412415155844.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-37 size-full\" src=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/cropped-lcdqtf6faykehdrbur5kd9lxhu3dxwha-etjcm4awbtlehnpvdzhxtlefz45tsu3eqh900-rw-e1412415155844.png\" alt=\"cropped-lcdqtf6faykehdrbur5kd9lxhu3dxwha-etjcm4awbtlehnpvdzhxtlefz45tsu3eqh900-rw-e1412415155844.png\" width=\"940\" height=\"198\" srcset=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/cropped-lcdqtf6faykehdrbur5kd9lxhu3dxwha-etjcm4awbtlehnpvdzhxtlefz45tsu3eqh900-rw-e1412415155844.png 940w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/cropped-lcdqtf6faykehdrbur5kd9lxhu3dxwha-etjcm4awbtlehnpvdzhxtlefz45tsu3eqh900-rw-e1412415155844-300x63.png 300w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/cropped-lcdqtf6faykehdrbur5kd9lxhu3dxwha-etjcm4awbtlehnpvdzhxtlefz45tsu3eqh900-rw-e1412415155844-768x162.png 768w\" sizes=\"auto, (max-width: 940px) 100vw, 940px\" \/><\/a><\/p>\n<p>The shaders are working great in\u00a0<a title=\"Google Play: GPU Mandelbrot\" href=\"https:\/\/play.google.com\/store\/apps\/details?id=tk.betelge.mandelbrot\" target=\"_blank\">the app<\/a>!<br \/>\nYou can see the full source code on GitHub <a href=\"https:\/\/github.com\/betelge\/mandelbrot\/\">here<\/a>. 
The shaders are in the res\/raw folder.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the previous post\u00a0the resolution of a Mandelbrot set fractal was greatly increased by generating it in a vertex shader instead of a fragment shader, because full highp 32-bit floats were available instead of the mediump 16-bit half-floats that &hellip; <a href=\"https:\/\/www.betelge.com\/blog\/2016\/08\/14\/emulated-64-bit-floats-in-opengl-es-shader\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-62","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/posts\/62","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/comments?post=62"}],"version-history":[{"count":0,"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/posts\/62\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/media?parent=62"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/categories?post=62"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/tags?post=62"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}