A correction to my previous message:
In asm I would write the loop as:

        mov eax, iColor
        mov ebx, pulLine
        mov edx, cy
    L1:
        mov edi, ebx
        mov ecx, _cx
        rep stosd
        add ebx, lDelta
        dec edx
        jnz L1
AFAIK it is not possible to optimize the loop further, and this version only saves a cmp and a jnz in the outer loop, which is a tiny gain.
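For comparison, here is a rough C sketch of what the asm loop above does. The function name fill_rect32 and the parameter types are my assumptions for illustration, not taken from the actual source. I also assume lDelta is a stride in bytes, since the asm adds it directly to the pointer. Note that the asm uses dec/jnz, so it expects cy to be at least 1; the sketch guards against cy == 0 explicitly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C equivalent of the asm loop (names and types are assumed).
 * pulLine: first pixel of the first row
 * lDelta:  distance between rows, in bytes (assumed, as in the asm "add")
 * cx, cy:  width in 32-bit pixels and height in rows
 */
static void fill_rect32(uint32_t *pulLine, ptrdiff_t lDelta,
                        size_t cx, size_t cy, uint32_t iColor)
{
    unsigned char *row = (unsigned char *)pulLine;

    if (cy == 0)
        return;                     /* the asm version would wrap around here */

    do {
        uint32_t *p = (uint32_t *)row;
        size_t i;

        for (i = 0; i < cx; i++)    /* corresponds to "rep stosd" */
            p[i] = iColor;

        row += lDelta;              /* corresponds to "add ebx, lDelta" */
    } while (--cy != 0);            /* corresponds to "dec edx / jnz L1" */
}
```

A decent compiler will probably turn the inner loop into something close to rep stosd or a vectorized store anyway, which is another reason the hand-written version gains so little.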
Jose Catena DIGIWAVES S.L.