Jose Catena wrote:
A correction of my previous msg:
In asm I would write the loop as:

    mov eax, iColor
    mov ebx, pulLine
    mov edx, cy
L1:
    mov edi, ebx
    mov ecx, _cx
    rep stosd
    add ebx, lDelta
    dec edx
    jnz L1
It is not possible to optimize the loop further AFAIK, and this only saves a cmp and jnz in the outer loop, a tiny gain.
That looks almost like the version I wrote, except that I kept the values in registers instead of accessing memory inside the loop, which should save some cycles.
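
For reference, the loop under discussion can be sketched in C roughly as follows. This is only an illustration, not the code from either version: fill_rect32 and stride_bytes are made-up names, and it assumes that lDelta is the byte distance between the starts of consecutive scanlines and that pixels are 32 bits wide, as the rep stosd suggests.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill a cx-by-cy rectangle of 32-bit pixels with one color.
   line         : first pixel of the top scanline (pulLine in the asm)
   stride_bytes : hypothetical name for lDelta, assumed to be in bytes
   Unlike the dec/jnz loop in the asm, this version does nothing when
   cy is 0 instead of wrapping around. */
static void fill_rect32(uint32_t *line, size_t cx, size_t cy,
                        ptrdiff_t stride_bytes, uint32_t color)
{
    unsigned char *row = (unsigned char *)line;

    while (cy-- > 0) {
        uint32_t *p = (uint32_t *)row;

        /* Inner loop: the part that rep stosd does in one instruction. */
        for (size_t n = cx; n != 0; n--)
            *p++ = color;

        /* Outer loop step: add ebx, lDelta */
        row += stride_bytes;
    }
}
```

A compiler will typically keep color, the row pointer and the counters in registers here, which matches the register-only variant mentioned above.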