{"id":50,"date":"2016-08-14T05:49:33","date_gmt":"2016-08-14T05:49:33","guid":{"rendered":"http:\/\/betelge.wordpress.com\/?p=50"},"modified":"2016-08-14T05:49:33","modified_gmt":"2016-08-14T05:49:33","slug":"high-precision-floats-in-opengl-es-shaders","status":"publish","type":"post","link":"https:\/\/www.betelge.com\/blog\/2016\/08\/14\/high-precision-floats-in-opengl-es-shaders\/","title":{"rendered":"High precision floats in OpenGL ES shaders"},"content":{"rendered":"<p>In the <a title=\"Emulated double precision in OpenGL ES shader\" href=\"http:\/\/betelge.wordpress.com\/2014\/10\/05\/emulated-double-precision-in-opengl-es-shader\/\">previous post<\/a>\u00a0I tried emulating higher precision by using two\u00a010-bit precision floats, but couldn&#8217;t get it working.<\/p>\n<p><a href=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-51\" src=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot3.png?w=300\" alt=\"2014-9-5_mandelbrot3\" width=\"300\" height=\"180\" srcset=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot3.png 800w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot3-300x180.png 300w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot3-768x461.png 768w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a> <a href=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot4.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-52\" src=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot4.png?w=300\" alt=\"2014-9-5_mandelbrot4\" width=\"300\" height=\"180\" srcset=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot4.png 800w, 
https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot4-300x180.png 300w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot4-768x461.png 768w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>As you can see in the lower part of the screenshots,\u00a0the precision\u00a0did increase but not by much. The upper part of the image remains blocky since the shader runs out of memory on my device and only updates part of the image. That&#8217;s because I revved up the number of iterations to 1024, but the final app should have that ability so this is another problem that needs to be solved. I also changed the color of the generated image to make the difference more apparent.<\/p>\n<h2>Generating a geometry that has one vertex per pixel<\/h2>\n<p>Let&#8217;s do the calculations in the vertex shader instead of the fragment shader. OpenGL ES guarantees high precision floats in vertex shaders but not in fragment shaders, so generating the fractal there should greatly increase the resolution. 
Vertex shaders are executed once per\u00a0polygon vertex and not per screen pixel, so we need to create a geometry that contains exactly one vertex per pixel of the screen and fill the screen with it.<\/p>\n<pre>public class Geometry {\n    private ShortBuffer indices;\n    private FloatBuffer buffer;\n    ...\n}\n\nprivate Geometry generatePixelGeometry(int w, int h) {\n    \/\/ One index and one (x, y, z) position per pixel\n    ShortBuffer indices = ShortBuffer.allocate(w*h);\n    FloatBuffer buffer = FloatBuffer.allocate(3*w*h);\n\n    for(int i = 0; i &lt; h; i++) {\n        for(int j = 0; j &lt; w; j++) {\n            \/\/ Center of pixel (j, i), mapped to the range [-1, 1]\n            buffer.put(-1 + (2*j + 1)\/(float)w);\n            buffer.put(-1 + (2*i + 1)\/(float)h);\n            buffer.put(0);\n\n            indices.put((short)(i*w+j));\n        }\n    }\n    \/\/ Rewind both buffers so they can be read from the start\n    buffer.flip();\n    indices.flip();\n\n    return new Geometry(indices, buffer);\n}<\/pre>\n<p>This generates a geometry that contains one vertex in the middle of every pixel on a\u00a0screen of resolution <code>w<\/code> x <code>h<\/code>.\u00a0Core OpenGL ES 2 supports only byte and <code>short<\/code> vertex indices, not <code>int<\/code> indices. That&#8217;s why we use a ShortBuffer for the indices buffer. It also means that we have to split our Geometry into smaller pieces. The maximum value of a Java <code>short<\/code> is\u00a032767.\u00a0The best option is to generate a single geometry that\u00a0has fewer vertices than that and\u00a0reuse it several times across the screen using multiple draw calls. 
Either way we have to make sure that the final scene has one vertex in every pixel of the screen.<\/p>\n<h2>The vertex shader<\/h2>\n<pre>attribute vec3 position;\n\nuniform mat4 modelViewMatrix;\nuniform float MAX_ITER;\nuniform vec2 scale;\nuniform vec2 offset;\n\nvarying vec4 rgba;\n\nvec4 color(float value, float radius, float max);\n\nvec2 iter(vec2 z, vec2 c)\n{\n\t\/\/ Complex number equivalent of z*z + c\n\treturn vec2(z.x*z.x - z.y*z.y, 2.0*z.x*z.y) + c;\n}\n\nvoid main(void)\n{\n\tvec4 posit = modelViewMatrix * vec4(position, 1.);\n\t\n\tvec2 c = 2.*(scale*posit.xy + offset);\n\t\n\tvec2 z = vec2(0.);\n\t\t\n\tint i;\n\tint max = int(MAX_ITER);\n\tfloat radius = 0.;\n\tfor( i=0; i&lt;max; i++ ) {\n\t\t\/\/ radius holds |z|^2, so the test below means |z| &gt; 4\n\t\tradius = z.x*z.x + z.y*z.y;\n\t\tif( radius &gt; 16.) break;\n\t\tz = iter(z,c);\n\t}\n\t\n\t\/\/ Points that never escape get the value 0\n\tfloat value = (i == max ? 0.0 : float(i));\n\t\n\trgba = color(value, radius, MAX_ITER);\n\t\n\tgl_Position = posit;\n}<\/pre>\n<p>The vertex shader has to take the <code>modelViewMatrix<\/code> into account because we split the geometry into multiple pieces and place them at different positions in the scene. The split is necessary because a single full-screen geometry would have more vertices than a <code>short<\/code> index can address.<\/p>\n<h2>The fragment shader<\/h2>\n<pre>precision mediump float;\n\nvarying vec4 rgba;\n\nvoid main()\n{\n\tgl_FragColor = rgba;\n}<\/pre>\n<p>The fragment shader is as simple as it can be. 
It just receives the <code>rgba<\/code> vector from the vertex shader and draws it to the frame buffer.<\/p>\n<h2>Results<\/h2>\n<p><a href=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-51\" src=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot3.png?w=300\" alt=\"2014-9-5_mandelbrot3\" width=\"300\" height=\"180\" srcset=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot3.png 800w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot3-300x180.png 300w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot3-768x461.png 768w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><a href=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-53\" src=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot5.png?w=300\" alt=\"2014-9-5_mandelbrot5\" width=\"300\" height=\"180\" srcset=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot5.png 800w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot5-300x180.png 300w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot5-768x461.png 768w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>That&#8217;s much better!<\/p>\n<p>The\u00a0fractal\u00a0renders slightly slower than with the lower precision shader, but still really quickly, and it completes 1024 iterations without\u00a0running out of memory. 
Let&#8217;s see how much we can zoom in now.<\/p>\n<p><a href=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot6.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-54\" src=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot6.png?w=300\" alt=\"2014-9-5_mandelbrot6\" width=\"300\" height=\"180\" srcset=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot6.png 800w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot6-300x180.png 300w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot6-768x461.png 768w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a> <a href=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot7.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-55\" src=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot7.png?w=300\" alt=\"2014-9-5_mandelbrot7\" width=\"300\" height=\"180\" srcset=\"https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot7.png 800w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot7-300x180.png 300w, https:\/\/www.betelge.com\/blog\/wp-content\/uploads\/2014\/10\/2014-9-5_mandelbrot7-768x461.png 768w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>In fragment mode, blocky pixels start filling the screen at a scale of 0.0002, a zoom level of 5000x. Switching to vertex mode shows an amazing difference. On my device, highp floats in the vertex shader are normal 32-bit floats with a precision of 23 bits, while the mediump 16-bit floats in the fragment shader only have a precision of 10 bits. That means that the number of precision bits is more than doubled. A single pixel from fragment mode, viewed in vertex mode, becomes bigger than the entire image was in fragment mode. 
In fact, pixels first became apparent at a scale of 2*10^(-6). See the results for yourself on Google Play: <a title=\"Google Play: GPU Mandelbrot\" href=\"https:\/\/play.google.com\/store\/apps\/details?id=tk.betelge.mandelbrot\" target=\"_blank\">GPU Mandelbrot<\/a>.<\/p>\n<p>In the next post I&#8217;ll give double emulation one more try.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the previous post\u00a0I tried emulating higher precision by using two\u00a010-bit precision floats, but couldn&#8217;t get it working. As you can see in the lower part of the screenshots,\u00a0the precision\u00a0did increase but not by much. The upper part of the &hellip; <a href=\"https:\/\/www.betelge.com\/blog\/2016\/08\/14\/high-precision-floats-in-opengl-es-shaders\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-50","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/posts\/50","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/comments?post=50"}],"version-history":[{"count":0,"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/posts\/50\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/media?parent=50"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/categories?post=50"},{"taxonomy
":"post_tag","embeddable":true,"href":"https:\/\/www.betelge.com\/blog\/wp-json\/wp\/v2\/tags?post=50"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}