Interior Mapping – Part 2

In part 1, we discussed the requirements and rationale behind Interior Mapping. In this second part, we’ll discuss the technical implementation of what I’m calling (for lack of a better title) “Tangent-Space Interior Mapping”.

Coordinates and Spaces

In the original implementation, room volumes were defined in object-space or world-space. This is by far the easiest coordinate system to work in, but it quickly presents a problem! What about buildings with angled or curved walls? At the moment, the rooms are bounded by building geometry, which can lead to extremely small rooms in odd corners and uneven or truncated walls!

In reality, outer rooms are almost always aligned with the exterior of the building. Hallways rarely run diagonally and are seldom narrower at one end than the other! We would rather have all our rooms aligned with the mesh surface, and then extruded inward towards the “core” of the building.

Cylindrical Building

Curved rooms, just by changing the coordinate basis.

In order to do this, we can just look for an alternative coordinate system for our calculations which lines up with our surface (linear algebra is cool like that). Welcome to Tangent Space! Tangent space is already used elsewhere in shaders. Even wonder why normal-maps are that weird blue color? They actually represent a series of directions in tangent-space, relative to the orientation of the surface itself. Rather than “Forward”, a Z+ component normal map points “Outward”. We can simply perform the raycast in a different coordinate basis, and suddenly the entire problem becomes surface-relative in world-space, while still being axis-aligned in tangent space! A neat side-effect of this is that our room volumes now follow the curvature of the building, meaning that curved facades will render curved hallways running their length, and always have a full wall parallel to the building exterior.

While we’re at it, what if we used a non-normalized ray? Most of the time, a ray should have a normalized direction. “Forward” should have the same magnitude as “Right”. If we pre-scale our ray direction to match room dimensions, then we can simplify it out of the problem. So now, we’re performing a single raycast against a unit-sized axis-aligned cube!

Room Textures

The original publication called for separate textures for walls, floors, and ceilings. This works wonderfully, but I find it difficult to work with. Keeping these three textures in sync can get difficult, and atlasing multiple room textures together quickly becomes a pain. Alternative methods such as the one proposed by Zoe J Wood in “Interior Mapping Meets Escher” utilizes cubemaps, however this makes atlasing downright impossible, and introduces new constraints on the artists building interior assets.
interior_atlas
Andrew Willmott briefly touched on an alternative in “From AAA to Indie: Graphics R&D”, which used a pre-projected interior texture for the interior maps in SimCity. This was the format I decided to use for my implementation, as it is highly author-able, easy to work with, and provides results only slightly worse than full cubemaps. A massive atlas of room interiors can be constructed on a per-building basis, and then randomly selected. Buildings can therefore easily maintain a cohesive interior style with random variation using only a single texture resource.

Finally, The Code

I’ve excluded some of the standard Unity engine scaffolding, so as to not distract from the relevant code. You won’t be able to copy-paste this, but it should be easier to see what’s happening as a result.

v2f vert (appdata v) {
   v2f o;
   
   // First, let's determine a tangent basis matrix.
   // We will want to perform the interior raycast in tangent-space,
   // so it correctly follows building curvature, and we won't have to
   // worry about aligning rooms with edges.
   half tanSign = v.tangent.w * unity_WorldTransformParams.w;
   half3x3 objectToTangent = half3x3(
      v.tangent.xyz,
      cross(v.normal, v.tangent) * tanSign,
      v.normal);

   // Next, determine the tangent-space eye vector. This will be
   // cast into an implied room volume to calculate a hit position.
   float3 oEyeVec = v.vertex - WorldToObject(_WorldSpaceCameraPos);
   o.tEyeVec = mul(objectToTangent, oEyeVec);

   // The vertex position in tangent-space is just the unscaled
   // texture coordinate.
   o.tPos = v.uv;

   // Lastly, output the normal vertex data.
   o.vertex = UnityObjectToClipPos(v.vertex);
   o.uv = TRANSFORM_TEX(v.uv, _ExteriorTex);

   return o;
}

fixed4 frag (v2f i) : SV_Target {
   // First, construct a ray from the camera, onto our UV plane.
   // Notice the ray is being pre-scaled by the room dimensions.
   // By distorting the ray in this way, the volume can be treated
   // as a unit cube in the intersection code.
   float3 rOri = frac(float3(i.tPos,0) / _RoomSize);
   float3 rDir = normalize(i.tEyeVec) / _RoomSize;

   // Now, define the volume of our room. With the pre-scale, this
   // is just a unit-sized box.
   float3 bMin = floor(float3(i.tPos,-1));
   float3 bMax = bMin + 1;
   float3 bMid = bMin + 0.5;

   // Since the bounding box is axis-aligned, we can just find
   // the ray-plane intersections for each plane. we only 
   // actually need to solve for the 3 "back" planes, since the 
   // near walls of the virtual cube are "open".
   // just find the corner opposite the camera using the sign of
   // the ray's direction.
   float3 planes = lerp(bMin, bMax, step(0, rDir));
   float3 tPlane = (planes - rOri) / rDir;

   // Now, we know the distance to the intersection is simply
   // equal to the closest ray-plane intersection point.
   float tDist = min(min(tPlane.x, tPlane.y), tPlane.z);

   // Lastly, given the point of intersection, we can calculate
   // a sample vector just like a cubemap.
   float3 roomVec = (rOri + rDir * tDist) - bMid;
   float2 interiorUV = roomVec.xy * lerp(INTERIOR_BACK_PLANE_SCALE, 1, roomVec.z + 0.5) + 0.5;

#if defined(INTERIOR_USE_ATLAS)
   // If the room texture is an atlas of multiple variants, transform
   // the texture coordinates using a random index based on the room index.
   float2 roomIdx = floor(i.tPos / _RoomSize);
   float2 texPos = floor(rand(roomIdx) * _InteriorTexCount) / _InteriorTexCount;

   interiorUV /= _InteriorTexCount;
   interiorUV += texPos;
#endif

   // lastly, sample the interior texture, and blend it with an exterior!
   fixed4 interior = tex2D(_InteriorTex, interiorUV);
   fixed4 exterior = tex2D(_ExteriorTex, i.uv);

   return lerp(interior, exterior, exterior.a);
}

And that’s pretty much all there is to it! The code itself is actually quite simple and, while there are small visual artifacts, it provides a fairly convincing representation of interior rooms!

 

Interior + Exterior Blend

 

There’s definitely more room for improvement in the future. The original paper supported animated “cards” to represent people and furniture, and a more realistic illumination model may be desirable. Still, for an initial implementation, I think things came out quite well!

Interior Mapping – Part 1

Rendering convincing environments in realtime has always been difficult, especially for games which take place at a “human” scale. Games consist of a series of layered illusions and approximations, all working (hopefully) together to achieve a unified goal; to represent the world in which the game takes place. In the context of a simplified or fantastical world, this isn’t too bad. It’s a matter of creating a unified style and theme that feels grounded in the reality of the particular game. The fantastic narrative platformer “Thomas was Alone”, for example, arguably conveys a believable world using just shape and color. As soon as a game takes place in an approximation of our real world however, the cracks start to appear. There are a tremendous number of “details” in the real world. Subtle differences on seemingly identical surfaces that the eye can perceive, even if not consciously.

uncanny valley

This CG incarnation of Dwayne Johnson as the titular “Scorpion King” is a prime example of “The Uncanny Valley”

We as humans are exceptionally good at identifying visual phenomena, and more importantly, its absence. You may have heard this referred to as “The Uncanny Valley”; when something is too realistic to be considered cute or cartoony, but too unrealistic to look… right… It’s extremely important to include some representation of those “missing” pieces, even if they’re not 100% accurate in order to preserve the illusion.

While not nearly as noticeable at first glance, missing details in an environment are equally important to preserving the illusion of a living, breathing, virtual world.

Take, for example, this furniture store from GTA IV.

GTA Furniture Store.png

A nice looking furniture store, though something’s missing…

This is a very nice piece of environment art. It’s visually interesting, it fits the theme and location, and it seems cohesive within the world… though something is amiss. The view through the windows is clearly just a picture of a store, slapped directly onto the window pane, like a sticker on the glass! There’s no perspective difference between the individual windows on different parts of the facade. The view of the interior is always head-on, even if the camera is at an angle to the interior walls. This missing effect greatly weakens the illusion.

From this, the question arises…

How do we convey volume through a window, without creating tons of work for artists, or dramatically altering the production pipeline?

Shader Tricks!

The answer (as you may have guessed from the header) lies in shader trickery! To put it simply, Shaders are tiny programs which take geometric information as input, mush it around a bunch, and output a color. Our only concern is that the final output color looks correct in the scene. What happens in the middle frankly doesn’t matter much. If we offset the output colors, we can make it look like the input geometry is offset too! If outputs are offset non-uniformly, it can be made to appear as though the rendered image is skewed, twisted, or distorted in some way.

uncanny valley

If you’ve ever seen at 3D sidewalk art, you’ve seen a real-world implementation of parallax mapping.

The school of techniques collectively known as “Parallax Mapping” do just this. Input texture coordinates are offset based on the observer angle, and a per-texel “depth” value. By determining the point where our camera ray intersects the surface height-field, we can create what amounts to a 3D projection of an otherwise 2D image. “Learn OpenGL” provides an excellent technical explanation of parallax mapping if you’re curious.

While the theory is perfect for our needs, the methodology is lacking. Parallax mapping is not without its issues! Designed to be a general-purpose solution, it suffers from a number of visible artifacts when used in our specific case. It works best on smoother height-fields, for instance. Large differences in height between texels can create weird visual distortions! There are a number of alternatives to get around this issue (such as “Steep Parallax Mapping”), but many are iterative, and result in odd “step” artifacts as the ratio of depth to iteration count increases. In order to achieve a convincing volume for our buildings using an unmodified parallax shader, we’d need to use so many iterations that it would quickly become a performance nightmare.

Interior Mapping

Parallax mapping met nearly all of our criteria, but still wasn’t suitable for our application. Whenever a general solution fails, it’s usually a good idea to sit down and consider the simplest possible specific solution that will suffice.

Raymarch

For each point on the true geometry (blue), select a color at the point of intersection between the camera ray, and an imaginary room volume (red).

In our case, we want rectangular rooms inset into our surface. The keyword here is “rectangular”. The generality of parallax mapping means that an iterative numeric approach must be used, since there is no analytical way to determine where our camera ray intersects a general height-field. If we limit the problem to only boxes, then an exact solution is not only possible, but trivial! Furthermore, if these boxes are guaranteed to be axis-aligned, the computation becomes extremely simple! Then, it’s just a matter of mapping the point of intersection within our room volume to a texture, and outputting the correct color!

interior mapping example

Example of “Interior Mapping” from the original publication by Joost van Dongen.

Originally published in 2008, the now well known “Interior Mapping”, by Joost van Dongen seems like a prime candidate! In this approach, the facade of a building mesh is divided into “rooms”, and a raycast is performed for each texel. Then, the coordinate at the point of intersection between our camera ray and the room volume can be used to sample a set of “Room Textures”, and voila! This, similar to parallax mapping, offsets input texture coordinates to provide a projection of a wall, ceiling, and floor texture within each implicit “room volume”, resulting in a geometrically perfect representation of an interior without the added complexity of additional geometry and material work!

In part 2, we’ll discuss modifications to the original implementation for performance and quality-of-life improvements!

Abusing Blend Modes for Fun and Profit!

Today I decided to do a quick experiment.

Hardware “blend modes” have existed since the dawn of hardware-accelerated graphics. Primarily used for effects like transparency, they allow a developer to specify the way new colors are drawn into the buffer through a simple expression.

color = source * SrcFactor + destination * DstFactor

The final output color is the sum of a “source factor” term multiplied by the value output by the fragment shader, and a “destination factor” term multiplied by the color already in the buffer.

For example, if I wanted to simply add the new color into the scene, I could use blend modes of One One; Our coefficients would be negligible and we would end up with

color = source + destination

If I wanted a linear alpha blend between the source color and destination color, I could select the terms SrcAlpha, OneMinusSrcAlpha, which would perform a linear interpolation between the two colors.

But what happens when we have non-standard colors? Looking back at the blend expression, logic would dictate that we can express any two-term polynomial as long as the terms are independent, and the coefficients are one of the supported “blend factors”! By pre-loading our destination buffer with a value, the second term can be anything we need, and the alpha channel of our source can be packed with a coefficient to use as the destination factor if need be.

This realization got me thinking. “Subtract” blend modes aren’t explicitly supported in OpenGL, however a subtraction is simply the addition of a negative term. If our source term were negative, surely blend factors of One One would simply subtract the source from the destination color! That isn’t to say that this is guaranteed to work without issues! If the render target is a traditional 24 or 32-bit color buffer, then negative values may have undefined behavior! A subtraction by addition of a negative would only work assuming the sum is calculated independently somewhere in hardware, before it’s packed into the unsigned output buffer.

Under these assumptions, I set out to try my hand at a neat little trick. Rendering global object “thickness” in a single pass.

Why though?

Thickness is useful for a number of visual effects. Translucent objects, for example, could use the calculated thickness to approximate the degree to which light is absorbed along the path of the ray. Refraction could be more accurately approximated utilizing both incident, and emergent light calculations. Or, you could define the shape of a “fog volume” as an arbitrary mesh. It’s actually quite a useful thing to have!

Single pass global thickness maps

So here’s the theory. Every pixel in your output image is analogous to a ray cast into your scene. It can be thought of as a sweep backwards along the path of light heading towards the camera. What we really want to determine is the point where that ray enters and exits our object. Knowing these two points, we also essentially know the distance travelled through the volume along that ray.

It just so happens that we know both of these things! The projective-space position of a fragment must be calculated before a color can be written into a buffer, so we actually know the location of every fragment, or continuing the above analogy, ray intersection on the surface. This is also true of the emergent points, which all lie on the back-faces of our geometry! If we can find the distance the ray has traveled before entering our volume, and the distance the ray has traveled before exiting it, the thickness of the volume is just the difference of the two!

So how is this possible in a single pass? Well, normally when we render objects, we explicitly disable the “backfaces”; triangles pointing away from our camera. This typically speeds things up quite a bit, because backfaces almost certainly lie behind the visible portion of our model, and shading them is simply a waste of time. If we render them however, our fragment program will be executed both on the front and back faces! By writing the distance from the camera, or “depth” value as the color of our fragment, and negating it for front-faces, we can essentially output the “back minus front” thickness value we need!

DirectX provides a convenient semantic for fragment programs. float:VFACE. This value will be set to 1 when the fragment is part of a front-face, and -1 when the fragment is part of a back-face. Just render the depth, multiplied by the inverted value of the VFACE semantic, and we’ve got ourselves a subtraction!

Cull Off // disable front/back-face culling
Blend One One // perform additive (subtractive) blending
ZTest Off // disable z-testing, so backfaces aren’t occluded.

fixed4 frag (v2f i, fixed facing : VFACE) : SV_Target {
return -facing * i.depth;
}

Unity Implementation

From here, I just whipped up a quick “Camera Replacement Shader” to render all opaque objects in the scene using our thickness shader, and drew the scene to an off-screen “thickness buffer”. Then, in a post-effect, just sample the buffer, map it to a neat color ramp, and dump it to the screen! In just a few minutes, you can make a cool “thermal vision” effect!

Issues

The subtraction blend isn’t necessarily supported on all hardware. It relies on a lot of assumptions, and as such is probably not appropriate for real applications. Furthermore, this technique really only works on watertight meshes. Meshes with holes, or no back-faces will have a thickness of negative infinity, which is definitely going to cause some problems. There are also a number of “negative poisoning” artifacts, where the front-face doesn’t necessarily overlap a corresponding backface, causing brief pixel flickering. I think this occasional noise looks cool in the context of a thermal vision effect, but there’s a difference between a configurable “glitch” effect, and actual non-deterministic code!

Either way, I encourage everyone to play around with blend-modes! A lot of neat effects can be created with just the documented terms, but once you get into “probably unsafe” territory, things start to get really interesting!

GPU Isosurface Polygonalization

Isosurfaces are extremely useful when it comes to data visualization. From medical imaging to fluid flow analysis, they are an excellent tool for understanding complex volumetric data. Games have also adopted some of these techniques for their on purposes From the more rigid implementation in the ubiquitous game Minecraft to the Gels in Portal 2, these techniques serve the same basic purpose.

I wanted to try my hand at a general-purpose implementation, but before we dive into things, we must first answer a few basic questions.

What is an isosurface?

An isosurface can be thought of as the solution of a continuous function which produces a constant output in 3D. If you’re visualizing an electromagnetic field for example, you might generate an isosurface for a given potential, so you can easily determine its overall shape. This technique can be applied to other arbitrary values as well. Given CT scan data, a radiologist could construct an isosurface at the density of a specific type of tissue, extracting a 3D representation of bones or organs to view them separately, rather than having to manipulate a less intuitive stack of images.

What will we use this system for?

I don’t work in the medical field, nor would I trust the accuracy of my implementation when it comes to making a diagnosis. I work in entertainment and computer graphics, and as you would imagine, the requirements are quite different. Digital artists can already craft far better visuals than any procedure yet known; the real challenge is dynamic data. Physically simulated fluids, player-modifiable terrain, mechanics such as these present a significant challenge for traditional artists! What we really need is a generalized system for extracting and rendering isosurfaces in real time to fill in the gaps.

What are the requirements for such a system?

Given our previous use case, we can derive a few basic requirements. In no particular order…

  1. The system must be intuitive. Designers have other things to do besides tweaking simulation volumes and fiddling with configurations.
  2. The system must be flexible. If someone suggests a new mechanic which relies heavily on procedural geometry, it should be easy to get up and running.
  3. The system must be compatible. The latest experimental extensions are fun, but if you want to release something that anyone can enjoy, it needs to run on 5 year old hardware.
  4. The system must be fast. At 60 fps, you only have 16ms to render everything in your game. We can’t spend 10 of that drawing a special effect.

Getting Started!

Let’s look at requirement no. 4 first. Like many problems in computing, surface polygonalization can be broken down into repeated instances of much smaller problems. At the end of the day, the desired output is a series of interconnected polygons which appear to make up a complex surface. If each of these component polygons is accounted for separately, we can dramatically reduce the scope of the problem. Instead of generating a polygonal surface, we are now generating a single polygon, which is a much less daunting task. As with any discretization process, it is necessary to define a regular sample interval at which our continuous function will be evaluated. In the simplest 3D case, this will take the form of a regular grid of cells, and each of these cells will form a single polygon. Suddenly, this polygonalization process becomes massively parallel. With this new outlook, the problem becomes a perfect fit for standard graphics hardware!

For compatibility, I chose to implement this functionality in the Geometry Shader stage of the rendering pipeline, as it allows for the creation of arbitrary geometry given some basic input data. A Compute Shader would almost definitely be a better option in terms of performance and maintainability, but my primary development system is OSX, which presents a number of challenges when it comes to the use of Compute Shaders. I intend to update this project in the future, once Compute Shaders become more common.

If the field is evaluated at a number of regular points and a grid is drawn between them, we can construct a set of hypothetical cubes with a single sample at each of its 8 vertices. By comparing the values at each vertex, it is trivial to determine the plane of intersection between the theoretical isosurface and the cubic sample volume. If properly evaluated, the local solutions for each sample volume will form integral parts of the global surface implicitly, without requiring any global information.

This is the basic theory behind the ubiquitous Marching Cubes algorithm, first published in 1987 and still commonly used today. While it is certainly battle-tested, there are a number of artifacts in the output geometry that can make surfaces appear rough. The generated triangles are also often non-uniform and narrow, leading to additional artifacts later in the rendering process. Perhaps a more pressing issue is the sheer number of cases to be evaluated. For every sample cell, there are 256 possible planar intersections. The fantastic implementation by Paul Bourke wisely recommends the use of a look-up table, pre-computing these cases. While this may work well in traditional implementations, it crumbles under the parallel architecture of modern GPUs. Graphics hardware tends to excel at executing large batches of identical instructions, but begins to falter as soon as complex conditional branching is involved and operations have to be evaluated individually. In my tests, I found that the look-up tables performed no better, if not worse than explicit evaluation, as the complier could not easily expand and unroll the program flow, and evaluation could not be easily batched. Bearing this in mind, we ideally need to implement a method with as few logical branches as possible.

Marching Tetrahedra is a variant of the Marching Cubes algorithm which divides each cube into 5 (or 6 for a slightly different topology) tetrahedra. By limiting our integral sample volume to four vertices instead of 8, the number of possible cases drops to 16. In my tests, I got a 16x performance improvement using this technique (though realized savings are heavily dependent on the hardware used), confirming our suspicions. Unfortunately, marching tetrahedra can have some strange surface features, and has a number of artifacts of its own, especially with dynamic sampling grids.

Because of this, I ended up settling on naive surface nets, a simple dual method which generates geometry spanning multiple voxel sample volumes. An excellent discussion of the differences between these three meshing algorithms can be found here. Perhaps my favorite advantage of this method is the relative uniformity of the output geometry. Surface nets tend to be smooth surfaces comprised of quads of relatively equal size. In addition, I find it easier to comprehend and follow than other meshing algorithms, as its use of look-up-tables, and possible cases is fairly limited.

Implementation Details

Isosurface_Sample.png

The sample grid is actually defined as a mesh, with a single disjoint vertex placed at each integral sample coordinate. These vertices aren’t actually drawn, but instead are used as input data to a series of shaders. Therefore, the shader can be considered to be executed “per-voxel”, with its only input being the coordinate of the minimum bounding corner. One disadvantage commonly seen in similar systems is a fundamental restriction on surface resolution due to a uniform sample grid. In order to skirt around this limitation, meshing is actually performed. In projected space, rather than world space, so each voxel is a truncated frustum similar to that of the camera, rather than a cube. This not only eliminates a few extra transformations in the shader code, but provides LoD implicitly by ensuring each output triangle is of a fixed pixel size, regardless of its distance to the camera.

Once the sample mesh was created, I used a simple density function for the potential field used by this system. It provides a good amount of flexibility, while still being simple to comprehend and implement. Each new source of “charge” added to the field would contribute additively to the overall potential.  However, this quickly raises a concern! Each contributing charge must be evaluated at all sample locations, meaning our shader must, in some way, iterate through all visible charges! As stated earlier, branching and loops which cannot be unrolled can cause serious performance hiccups on most GPUs!

While I was at it, I also implemented a Vertex Pre-pass. Due to the nature of GPU parallelism, each voxel is evaluated in complete isolation. This has the unfortunate side-effect of solving for each voxel vertex position up to 6 times (once for each neighboring voxel). The surface net algorithm utilizes an interpolated surface vertex position, determined from the intersections of the surface and the sample volume. This interpolation can get expensive if repeated 6 times more than necessary! To remedy this, I instead do a pre-pass calculating the interpolated vertex position, and storing it as a normalized coordinate within the voxel in the pixel color of another texture. When the geometry stage builds triangles, it can simply look up the normalized vertex positions from this table, and spit them out as an offset from the voxel min coordinate!

The geometry shader stage is then fairly straightforward, reading in vertex positions, checking the case of the input voxel, looking up the vertex positions of its neighbors, and bridging the gap with a triangle.

Was it worth it?

Short answer, no.

I am extremely proud of the work I’ve done, and the end result is quite cool, but it’s not a solution I would recommend in a production setting. The additional complexity far outweighs any potential performance benefit, and maintainability, while not terrible, takes a hit as well. In addition, the geometry shader approach doesn’t work nearly as well as I had hoped. Geometry shaders are notoriously cache-unfriendly, and my implementation is no exception. Combine this with the rather unintuitive nature of working with on-GPU procedural geometry in a full-scale project, and you’ve got yourself a recipe for very unhappy engineers.

I think the concept of on-GPU surface meshing is fascinating, and I’m eager to look into Compute Shader implementations, but as it stands, the geometry stage is not the way to go.

I’ve made the source available on my GitHub if you’d like to check it out!

Messing With Shaders – Realtime Procedural Foliage

 

ivy_close.png
The programmable rendering pipeline is perhaps one of the largest advances in the history of realtime computer graphics. Before its introduction, graphics libraries like OpenGL and DirectX were limited to the “fixed function pipeline”, a programmer would shove in geometric data, and the application would draw it however it saw fit. Developers had little to no control over the output of their application beyond a few “render mode” settings. This was fine for rendering relatively simple scenes, solid objects, and simplistic lighting, but as visual fidelity increased and hardware become more powerful it quickly became necessary to allow for a more customizable rendering.

The process of rendering a 3D object in the modern programmable pipeline is typically broken down into a number of steps. Data is copied into fast-access graphics memory, then transformed through a series of stages before the graphics hardware eventually rasterizes that data to the display. In its most basic form, there are two of these stages the developer can customize. The “Vertex Program” manipulates data on a per-vertex level, such as positions and texture coordinates, before handing the results on to the “Fragment Program”, which is responsible for determining the properties of a given fragment (like a pixel containing more than just color information). The addition of just these two stages opened the floodgates for interesting visual effects. Approximating reflections for metallic objects, cel-shading effects for cartoon characters, and more! Since then, even more optional stages have been inserted into the pipeline for an even greater variety of effects.

I’ve spent a considerable amount of time experimenting with vertex and fragment programs in the past, but this week I decided to spend a few hours working with the other, less common stages, mainly “Geometry Programs”. Geometry programs are a more recent innovation, and have only began to see extensive use in the last decade or so. They essentially allow developers to not only modify vertex data as it’s received, but to construct entirely new vertices based on the input primitives (triangles, quads, etc.) As you can easily imagine, this presents incredible potential for new effects, and is something I personally would like to become more experienced with.

In four or five hours, I managed to write a relatively complex effect, and the rest of this post will detail, at a high level, what I did to achieve it.

ivy_distant.png

Procedurally generated geometry for ivy growing on a simple building.

This is my procedural Ivy shader. It is a relatively simple two-pass effect which will apply artist-configurable ivy to any surface. What sets this effect apart from those I’ve written in the past is that it actually constructs new geometry to add 3D leaves to the surface extremely efficiently.

One of the major technical issues when it comes to rendering things like foliage is that the level of geometric detail required to accurately represent leaves is quite high. While a digital environment artist could use a 3D modeling program to add in hundreds of individual leaves, this is not necessarily a good use of their time. Furthermore, it quickly becomes unmaintainable if anyone decides that the position, density, or style of foliage should change in the future. I don’t know about you, but I don’t want to be the one to have to tell a team of environment artists that all of the ivy in an entire game needs to be slightly different. In this situation, the key is to work smarter, not harder. While procedural art is often controversial in the game industry, I think most developers would agree that artist-directed procedural techniques are an invaluable tool.

Ivy_Shader_Steps.png
First and foremost, my foliage effect is composed of two separate rendering passes. First, a triplanar-mapped base texture is blended onto the object based on the desired density of the ivy. This helps to make the foliage feel much more dense, and helps to hide the seams where the leaves meet the base geometry.

Next in a second rendering pass, the geometry program transforms every input triangle into a set of quads lying on that triangle with a uniform, psuedo-random distribution. First, it is necessary to determine the number of leaf quads to generate. In order to maintain a consistent density of leaf geometry, the surface area of the triangle is calculated quickly using the “half cross-product formula”, and is then multiplied by the desired number of leaves per square meter of surface area. Then, for each of these leaves, a random sample point on the triangle is picked, and a triangle strip is emitted. It does this by sampling a noise function seeded with the world-space centroid of the triangle and the index of the leaf quad being generated. These noise values are then used to generate barycentric coordinates, which in turn are used to interpolate the position and normal of the triangle at that point, essentially returning a random world-space position and its corresponding normal vector.

Now, all that’s needed is to determine the orientation of the leaf, and output the correct triangle-strip primitive. Even this is relatively simple. By using the world-space surface normal and world “up” vector, a simple “change of vector basis” matrix is constructed. Combining this with a slightly randomized scale factor, and a small offset to orientation (to add greater variety to patches of leaves), we can transform normalized quad vertices into the exact world-space positions we want for our leaves!

...

// Defines a unit-size square quad with its base at the origin. doing
// this allows for very easy scaling and positioning in the next steps.
static const float3 quadVertices[4] = {
   float3(-0.5, 0.0, 0.0),
   float3( 0.5, 0.0, 0.0),
   float3(-0.5, 0.0, 1.0),
   float3( 0.5, 0.0, 1.0)
};

...

// IN THE GEOMETRY SHADER
// Change of basis matrix converts from XYZ space to leaf-space
float3x3 leafBasis = float3x3(
   leafX.x, leafY.x, leafZ.x,
   leafX.y, leafY.y, leafZ.y,
   leafX.z, leafY.z, leafZ.z
);

// constructs a random rotation matrix from Euler angles in the range 
// (-10,10) using wPos as a seed value.
float3x3 leafJitter = randomRotationMatrix(wPos, 10);

// Combine the basis matrix by the random rotation matrix to get the
// complete leaf transformation. Note, we could use a 4x4 matrix here
// and incorporate the translation as well, but it's easier to just add
// the world position as an offset in the final step.
float3x3 leafMatrix = mul(leafBasis, leafJitter);

// lastly, we can just output four vertices in a triangle strip
// to form a simple quad, and we'll be on our merry way.
for ( int i = 0; i < 4; i ++ ) {
   FS_INPUT v;
   v.vertex = UnityWorldToClipPos( 
      float4( mul(leafMatrix, quadVertices[i] * scale), 1) + wPos 
   );
   triStream.Append(v);
}

At this point, the meat of the work is done! We’ve got a geometry shader outputting quads on our surface. The last thing needed is to texture them, and it works!

Configuration!

I briefly touched on artist-configurable effects in the introduction, and I’d like to quickly address that too. I opted to go with the simplest solution I could think of, and it ended up being incredibly effective.

venus_vertex_weights.png

Configuring procedural geometry using painted vertex weights.

The density and location of ivy is controlled through painted vertex-colors. This allows artists to simply paint sections of their model they would like to be covered in foliage, and the shader will use this to weight the density and distribution of the procedural geometry. This way, an environment artist could use the tools they’re familiar with to quickly sketch out what parts of a model they would like to be effected by the shader. It will take an experienced artist less than a minute to get a rough draft working in-engine, and changes to the foliage can be made just as quickly!

At the moment, only the density of the foliage is mapped this way (All other parameters are uniform material properties), but I intend to expand the variety of properties which can be expressed this way, allowing for greater control over the final look of the model.

TODOs!

This ended up being an extremely informative project, but there are many things still left to do! For one, the procedural foliage does not take lighting into account. I built this effect in the Unity game engine, and opted out of using the standard “Surface Shader” code-generation system, which while very useful in 99% of cases, is extremely limiting in situations such as this. I would also like to improve the resolution of leaf geometry, applying adaptive runtime tessellation to the generated primitives in order to give them a slight curve, rather than displaying flat billboards. Other things, such as color variation on leaves could go a long way to improving the effect, but for now I’m quite satisfied with how it went!

Whelp, on to the next one!

 

tsGL – Improvements

I worked a bit more on tsGL over the last few days, and managed to clean up a few things that were really bothering me! So far progress has been relatively smooth and it’s actually turning out quite well! So, what changed?

Lighting!
TSGL’s lighting code is much cleaner now, and seems to work pretty well! Lighting is broken down into a multi-pass system which allows for an arbitrary number of lights to be applied to each object. Take this scene, for example…

Composite.png

tsGL – a scene featuring three point lights. Red, blue, and white.

This scene features three realtime lights, a red light in the back left corner, a blue light behind the camera, and a white light above the scene. All of these lights are drawn as “point-lights”, meaning that they act as omnidirectional sources.

First, the scene is drawn with no lights applied. This is important to capture shader-specific details on each object, such as emissive textures, reflections, and unlit details.

Next, lights are sorted based on their “importance”. This is calculated based on distance to the object being rendered and the intensity of the light. If there’s a large, bright light shining on an object, it will be drawn first, followed by all other lights until we’ve either reached the maximum allowed number.

Then, light parameters are packed into a 4×4 matrix. This may seem odd, but it also means that all attributes of a light can be passed as a single input to the GLSL shader. This allows for a large amount of flexibility in designing shader programs, as well as the convenience of not requiring several uniform variables to be defined in each one.

The Vertex Shader calculates a set of values useful for lighting, mainly the per-vertex direction of incident light, and the attenuation of brightness over distance. These are calculated per-vertex because it reduces the number of necessary calculations significantly, and small imperfections due to interpolation over the triangle are largely imperceptible!

attenuation_withsource.png

Attenuation of one of the light in the scene, visualized in false color. Ranges from red (high intensity) to green (low intensity)

By scaling the intensity of the light with the square of the distance from the source, lights will appropriately grow dimmer as they are moved farther from an object. The diffuse component of the light is also calculated per-fragment using the typical Lambertian reflectance model, and ensures that only the “light-facing side” of objects are shaded. In the above image, the intensity of a red light throughout the scene is visualized in false color, and the final diffuse light calculation is shown on the bottom right.

Awesome, but at this point our scene is just an unlit void! How do we combine the output of each light pass into a final image?

By exploiting OpenGL blend-modes, we can produce exactly the effect we want! OpenGL allows the programmer to specify an active “blend-mode”, essentially determining how new data is written to the display buffer! This is primarily used for rendering transparent objects. A window pane for instance, would need to be rendered over top of the rest of a scene, and would mix the background color with the color of the glass itself to produce a final color! This is no different!

For these lights, the OpenGL Blend Mode is set to “additive”. This will literally add together the colors of every object drawn, which in the case of lights is just what we need. Illumination is a purely additive process, and it is impossible for a light to make things darker. Because of this, simply adding the effects of several lights together will output the illuminated scene as a whole! The best part is that it works without having to pass an array of lighting information to the shader, or arbitrarily limiting the number of available lights based on hardware! While the overhead of rendering an additional pass is non-trivial, it’s a small price to pay for the flexibility allowed by this approach.

Here, we can see the influence of each of the three lights.

lights.png

The additive passes of each of the three lights featured in the scene above. By summing together these three images, we obtain the fully illuminated scene.

At the end of the day, we end up with a process that looks like this.

  1. Clear your drawing buffers. (erases the previous frame so we have a clean slate.)
  2. Draw the darkened scene.
  3. Sort the lights based on their “importance”.
  4. Set the blend mode to “additive”.
  5. For each light in the scene (in order of importance)
    1. Draw the scene again, illuminated by the light.
  6. We’re done! Display the buffer!

This solution isn’t perfect, and more powerful techniques have been described in recent years, but given the restrictions of WebGL, I find this technique to work quite well. One feature I would like to add is for the scene to only draw objects effected by a light in the additive pass, rather than the entire scene over again. This allows us to skip any calculations that would not effect an object in some way, and may increase performance, though without further testing, it’s difficult to say for sure.

Cubemaps!
This is always a fun feature to add, because it can have incredibly apparent results. Cubemaps are essentially texture-maps that exist on all sides of a cube. Rather than sampling a single point for a color, you would sample a direction, returning the color at that “angle” within the cube. By providing an image for each face, a cubemap can be built to represent lighting information, the surrounding environment, or whatever else would require a 360 texture lookup!

cubemaps_skybox

Example of a cubemap, taken from “LearnOpenGL.com”

tsGL now supports cubemaps as a specific instance of a “texture”, and they can be mapped to materials and used identically in the engine! One of the clever uses of a cubemap is called “environment mapping”, which essentially boils down to emulating reflection by looking up the color of the surrounding area in a precomputed texture. This is far more efficient than actually computing reflections dynamically, and plays much more nicely within the paradigms of traditional computer graphics! Here’s a quick example of an environment-mapped torus running in tsGL!

yakf0.gif

An environment-mapped torus, showcasing efficient reflection.

Now that cubemaps are supported, it’s also possible to make reflective and refractive materials efficiently, so shader programs can be made much more interesting within the confines of the engine!

Render Textures
Another nifty feature is the addition of render textures! By essentially binding a “camera” object to a texture, it is possible to render the scene into that texture, instead of onto the screen! This texture can then be used like any other anywhere in the drawing process, which means it’s possible to do things like draw a realtime security camera monitor in the scene, or have a mirror with realtime reflections! This can get quite costly, so it is best used sparingly, but the addition of this feature opens the door to a wide variety of other cool effects!

With the addition of both cubemaps and render textures, I hope to get shadow-mapping working in the near future, which would allow objects to appropriately cast shadows when illuminated in the scene, which was previously infeasible!

And now, the boring stuff – HTML
The custom HTML tag system has been improved immensely, and now makes much more sense. Entity tags may now be nested to define object hierarchies, and arbitrary parameters can be provided as child-tags, rather than attributes. This generally makes the scene documents far more legible, and makes adding new features in the future much easier.

Here’s a “camera” object, for example.

<tsgl-entity id=”main_camera”>
<tsgl-component type=”camera”>
<tsgl-property type=”number” name=”fov” value=”80″></tsgl-property>
<tsgl-property type=”number” name=”aspect” value=”1.6″></tsgl-property>
</tsgl-component>
<tsgl-component type=”transform”>
<tsgl-property type=”vector” name=”position” value=”0 2 0″></tsgl-property>
</tsgl-component>
</tsgl-entity>

Previously, the camera parameters would have been crammed into a single tag’s attributes, making it much more difficult to read, and much more verbose. With the addition of tsgl-property tags, attributes of each scene entity can now be specified within the entity’s definition, so all of those nice editor features like code-folding can now be exploited!

This part isn’t exactly fun compared to the rendering tests earlier, but it certainly helps when attempting to define a scene, and add new features!

That’s all for now! In the meantime, you can check out the very messy and very unstable, tsGL on GitHub if you want to try it for yourself, or experiment with new features!

tsGL – An Experiment in WebGL

dragon_turntable

For quite some time now, I’ve been extremely interested in WebGL. Realtime, hardware-accelerated rendering embedded directly in a web-page? Sounds like a fantastic proposition. I’ve dabbled quite a bit in desktop OpenGL and have grown to like it, despite its numerous… quirks… so it seemed only natural to jump head-first into WebGL and have a look around!

So, WebGL!
I was quite surprised by WebGL’s ease of use! Apart from browser compatibility (which is growing better by the week!) WebGL was relatively simple to set up. Initialize an HTML canvas context, and go to town! The rendering pipeline is nearly identical to OpenGL ES, and supports the majority of the same features as well! If you have any knowledge of desktop OpenGL, the Zero-To-Triangle time of WebGL should only be an hour or so!

Unfortunately, WebGL must be extremely compatible if it is to be deployed to popular web-browsers. This means that it has to run on pretty much anything with a graphics processor, and can’t rely on the user having access to the latest and greatest technologies! For instance, when writing the first draft of my lighting code, I attempted to implement a deferred rendering pipeline, but quickly discovered that multi-target rendering isn’t supported in many WebGL instances, and so I had to fall back to the more traditional forward rendering pipeline, but it works nonetheless!

dabrovic_pan

A textured scene featuring two point lights, and an ambient light.

Enter TypeScript!
I’ve never been particularly fond of Javascript. It’s actually quite a powerful language, but I’ve always been uncomfortable with the syntax it employs. This, combined with the lack of concrete typing can make it quite difficult to debug, and quite early on I encountered some issues with trying to multiply a matrix by undefined. By the time I had gotten most of my 3D math working, I had spent a few hours trying to find the source of some ugly, silent failures, and had decided that something needed to be done.

I was recommended TypeScript early in the life of the project, and was immediately drawn to it. TypeScript is a superset of Javascript which employs compile-time type checking, and a more familiar syntax. Best of all, it is compiled to standard minified Javascript, meaning it is perfectly compatible with all existing browsers! With minimal setup, I was able to quickly convert my existing code to TypeScript and be on my way! Now, when I attempt to take the cross product of a vector, and “Hello World”, I get a nice error in the Javascript console, instead of silent refusal.

Rather than defining an object prototype traditionally,

var MyClass = ( function() {
function MyClass( value ) {
this.property = value;
}
this.prototype.method = function () {
alert( this.property );
};
return MyClass;
})();

One can just specify a class.

class MyClass {
property : string;
constructor( value : string ) {
this.property = value;
}
method () {
alert( this.property );
}
}

This seems like a small difference, and honestly it doesn’t matter very much which method you use, but notice that property is specified to be of type string. If I were to construct an instance of MyClass and attempt to pass an integer constant, a compiler error would be thrown, indicating to me that MyClass instead requires a string. This does not effect the final Javascript, but reduces the chance of making a mistake while writing code significantly, and makes it much easier to keep your thoughts straight when coming back to a project after a few days.

Web-Components!
When trying to decide on how to represent asset metadata, I eventually drafted a variant on XML, which would allow for simple asset and object hierarchies to be defined in “scenes” that could be loaded into the engine as needed. It only took a few seconds before I realized what I had just described was essentially HTML. From here I looked into the concept of “Web-Components” a set of prototype systems that would allow for more interesting UI and DOM interactions in modern web browsers. One of the shiny new features proposed is custom HTMLElements, which allow developers to define their own HTML tags, and associated Javascript handlers. With Google Chrome supporting these fun new features, I quickly took advantage.

Now, tsGL scenes can be defined directly in the HTML document. Asset tags can be inserted to tell the engine where to find certain textures, models, and shaders. Here, we initialize a shader called “my-shader”, load an OBJ file as a mesh, and construct a material referencing a texture, and a uniform “shininess” property.

<tsgl-shader id=”my-shader” vert-src=”/shaders/test.vert” frag-src=”/shaders/test.frag”></tsgl-shader>

<tsgl-mesh id=”my-mesh” src=”/models/dragon.obj”></tsgl-mesh>

<tsgl-material id=”my-material” shader=”my-shader”>
<tsgl-texture name=”uMainTex” src=”/textures/marble.png”></tsgl-texture>
<tsgl-property name=”uShininess” value=”48″></tsgl-property>
</tsgl-material>

We can also specify our objects in the scene this way! Here, we construct a scene with an instance of the “renderer” system, a camera, and a renderable entity!

<tsgl-scene id=”scene1″>
<tsgl-system type=”renderer”></tsgl-system>

<tsgl-entity>
<tsgl-component type=”camera”></tsgl-component>
<tsgl-component type=”transform” x=”0″ y=”0″ z=”1″></tsgl-component>
</tsgl-entity>

<tsgl-entity id=”dragon”>
<tsgl-component type=”transform” x=”0″ y=”0″ z=”0″></tsgl-component>
<tsgl-component type=”renderable” mesh=”mesh_dragon” material=”mat_dragon”></tsgl-component>
</tsgl-entity>
</tsgl-scene>

Entities can also be fetched directly from the document, and manipulated via Javascript! Using document.getElementById(), it is possible to obtain a reference to an entity defined this way, and update its components! While my code is still far from production-ready, I quite like this method! New scenes can be loaded asynchronously via Ajax, generated from web-servers on the fly, or just inserted into an HTML document as-is!

Future Goals
I wanted tsGL to be a platform on which to experiment with new web-technologies, and rendering concepts, so I built it to be as flexible as possible. The engine is broken into a number of discreet parts which operate independently, allowing for the addition of cool new features like rigidbody physics, a scripting interface, or whatever else I want in the future! At the moment, the project is quite trivial, but I’m hoping to expand it, test it, and optimize it in the near future.

At the moment, things are a little rough around the edges. Assets are loaded asynchronously, and the main context just sits and complains until they appear. Rendering and updating operate on different intervals, so the display buffer tears like tissue paper. OpenGL is forced to make FAR more context switches than necessary, and my file parsers don’t cover the full format spec, but all in all, I’m quite proud of what I’ve managed to crank out in only ten or so hours of work!

If you’d like to check out tsGL for yourself, you can download it from my GitHub page!