RacorX14

The next example program is an optimized and improved version of RacorX13. The following paragraphs show the interim pixel shader solutions I went through. My first goal was to free one of the texture stages occupied by the attenuation map. Therefore I fetched the attenuation map in r4 twice:

    ; t0 - coordinates color map
    ; t1 - light vector
    ; t2 - half vector
    ; t3 - unused
    ; t4 - coordinates 2D attenuation map
    ; t5 - coordinates 1D attenuation map
    ps.1.4
    texld r1, t1                ; cube map
    texld r0, t0                ; normal map
    texcrd r3.rgb, t2           ; half angle vector
    texld r4, t5                ; attenuation 1D map
    dp3 r1.rgb, r1_bx2, r0_bx2  ; diffuse
    dp3 r5.rgb, r3, r0_bx2      ; specular
    mov r0.r, r4.a
    phase
    texld r2, t0                ; color map + specular map
    texld r3, r5                ; samples specular value from specular map
    texld r4, t4                ; attenuation 2D map
    add r0, 1-r4, -r0.r         ; (1.0 - 2Dvalue) - dest
    mul r1.a, r2.a, r3.a        ; (N dot H)^16 * gloss value
    +mul r1.rgb, r1, r2         ; diffuse * color
    add r5, r1.a, r1            ; ((N dot H)^16 * gloss value) + (diffuse * color)
    mul r0, r5, r0              ; attenuation

Fetching r4 twice costs an additional mov instruction but reduces the number of used texture stages to five.
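The attenuation arithmetic in the second phase can be checked outside the shader. The following Python sketch only mirrors the add and the final mul instruction for a single channel. It assumes that both attenuation maps return values in 0..1 and that the final shader output is clamped to 0..1. The function names are illustrative and not part of the example program.

```python
def saturate(x):
    """Clamp to 0..1 (assumption: the final shader output is clamped this way)."""
    return max(0.0, min(1.0, x))


def attenuation(value_2d, value_1d):
    """Mirror 'add r0, 1-r4, -r0.r': (1.0 - 2D map value) - 1D map value."""
    return (1.0 - value_2d) - value_1d


def attenuated(lit_color, value_2d, value_1d):
    """Mirror 'mul r0, r5, r0' for one channel, followed by the output clamp."""
    return saturate(lit_color * attenuation(value_2d, value_1d))
```

When the two map values add up to more than 1.0, the attenuation factor becomes negative and the clamped result is black, which is the point at which the light no longer reaches the surface.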
I used the newly available sixth stage to set the same cube normalization map again, this time sampled into r5 to normalize the half vector:

    ; t0 - coordinates color map
    ; t1 - light vector
    ; t2 - half vector
    ; t3 - unused
    ; t4 - coordinates 2D attenuation map
    ; t5 - coordinates 1D attenuation map
    ps.1.4
    texld r1, t1                ; cube map for light vector
    texld r0, t0                ; normal map
    texld r5, t2                ; cube map for half angle vector
    texld r4, t5                ; attenuation 1D map
    dp3 r1.rgb, r1_bx2, r0_bx2  ; diffuse
    dp3 r5.rgb, r5, r0_bx2      ; specular
    mov r0.r, r4.a
    phase
    texld r2, t0                ; color map
    texld r3, r5                ; samples specular value from specular map
    texld r4, t4                ; attenuation 2D map
    add r0, 1-r4, -r0.r         ; (1.0 - 2Dvalue) - dest
    mul r1.a, r2.a, r3.a        ; (N dot H)^16 * gloss value
    +mul r1.rgb, r1, r2         ; diffuse * color
    add r5, r1.a, r1            ; ((N dot H)^16 * gloss value) + (diffuse * color)
    mul r0, r5, r0              ; attenuation

This example program sets the same cube map twice, in r1 and r5. The next shader fetches the cube normalization map from one texture stage twice, first with the half vector and later with the light vector:

    ; t0 - coordinates color map
    ; t1 - light vector
    ; t2 - half vector
    ; t3 - unused
    ; t4 - coordinates 2D attenuation map
    ; t5 - coordinates 1D attenuation map
    ps.1.4
    texld r0, t0                ; normal map
    texld r5, t2                ; cube map for half angle vector
    texld r4, t5                ; attenuation 1D map
    dp3 r5.rgb, r5, r0_bx2      ; specular
    mov r1.r, r4.a
    phase
    texld r2, t0                ; color map
    texld r3, r5                ; samples specular value from specular map
    texld r4, t4                ; attenuation 2D map
    texld r5, t1                ; cube map for light vector
    dp3 r5.rgb, r5_bx2, r0_bx2  ; diffuse
    add r4, 1-r4, -r1.r         ; (1.0 - 2Dvalue) - dest
    mul r1.a, r2.a, r3.a        ; (N dot H)^16 * gloss value
    +mul r1.rgb, r5, r2         ; diffuse * color
    add r3, r1.a, r1            ; ((N dot H)^16 * gloss value) + (diffuse * color)
    mul r0, r3, r4              ; attenuation

To use the same cube map twice, the dp3 instruction for the diffuse reflection had to be moved into the second phase so that the temporary registers rn could be rearranged.
I think RacorX14 produces a better visual appearance than RacorX13, but I cheated a little bit to get the effect. If you take a closer look at the first dp3 instruction, you can see the cheat. The cube normalization map returns values in the range 0..1, which are usually biased and scaled to -1..1 via a _bx2 modifier in the pixel shader. I did not add this modifier to the r5 register in this instruction. If you add this modifier to this dp3 instruction, you will see an ugly banding effect. This happens because of the low level of color precision of some graphics hardware. Leaving the modifier out distorts the result of the dp3 instruction and leads to this nice-looking specular effect. A few more elegant solutions to this problem are presented in the paragraph "Improvements" below. I haven't tried a -8..8 cube normalization map, which is possible with ps.1.4-capable graphics cards.
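To see what the missing modifier does numerically, the following Python sketch mirrors the specular dp3 with and without _bx2 on the half vector. It is only an illustration under stated assumptions: the textures are taken to store each vector component as 0.5 * x + 0.5, the sample vectors are made up, and hardware precision is ignored. All function and variable names are hypothetical.

```python
import math


def normalize(v):
    """Scale a 3-component vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def encode(v):
    """Pack components from -1..1 into the 0..1 range (assumed texture encoding)."""
    return tuple(c * 0.5 + 0.5 for c in v)


def bx2(v):
    """Mirror the _bx2 source modifier: (x - 0.5) * 2 per component."""
    return tuple((c - 0.5) * 2.0 for c in v)


def dp3(a, b):
    """Three-component dot product, as computed by the dp3 instruction."""
    return sum(x * y for x, y in zip(a, b))


# Made-up sample vectors in tangent space.
normal = normalize((0.0, 0.0, 1.0))
half = normalize((0.3, 0.0, 1.0))

normal_texel = encode(normal)  # what the normal map would return
half_texel = encode(half)      # what the cube normalization map would return

# dp3 r5.rgb, r5_bx2, r0_bx2 (modifier on both sources)
with_modifier = dp3(bx2(half_texel), bx2(normal_texel))
# dp3 r5.rgb, r5, r0_bx2 (the version used in RacorX14)
without_modifier = dp3(half_texel, bx2(normal_texel))
```

Without the modifier, the result is 0.5 * (N dot H) plus half the sum of the normal's components instead of N dot H. For a normal pointing straight along z, that is 0.5 * (N dot H) + 0.5, so the value used to look up the specular map is pushed toward its upper half.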