Hi Royce:
int highest_bit ( int i )
{
int ret = 0;
if ( i > 0xffff )
i >>= 16, ret = 16;
if ( i > 0xff )
i >>= 8, ret += 8;
if ( i > 0xf )
i >>= 4, ret += 4;
if ( i > 0x3 )
i >>= 2, ret += 2;
return ret + (i>>1);
}
Guys I think I have one of those optimizations manuals for asm that actually use only one multiplication
by a called ¨magic number¨ to get the first bit set. I actually don´t want to trace this code in my head
but if you are searching a fast BSF replacement... It was tested using brute force for the 2^32 possible values
on the argumment. Maybe you could check the internet for ¨Agner Fog optimization¨. I have a very bad
memory so maybe the name is wrong. This is all the info I can give until
I get home. Ohh well i can do it myself: http://www.df.lth.se/~john_e/gems/gem0039.html It was Harley´s magic number.
so I wasn´t that far.
or you can read it here... I have a c version but I guess you don´t need that :)
----------------------------------------------------------------------------------------------------------
Fast BSF replacement ASM/80386
Macro EMBSF5 emulates the BSF instruction for non-zero argument <------------------------------------ remember to test for this!!!!!!!
This macro utilizes an algorithm published in the news group comp.arch by Robert Harley in 1996. The algorithm converts the general problem of finding the position of the least significant set bit into a special case of counting the number of bits in a contiguous block of set bits. By computing x^(x-1), where x is the original input argument, a right-aligned contiguous group of set bits is created, whose cardinality equals the position of the least significant set bit in the original input plus 1.
The input x is of the form (regular expression): {x}n1{0}m. x-1 has the form {x}n{0}(m+1), and x^(x-1) has the form {0}n{1}(m+1). This step is pretty similar to the one used by macro PREPBSF <http://www.df.lth.se/~john_e/gems/gem002e.html> , only that PREPBSF <http://www.df.lth.se/~john_e/gems/gem002e.html> creates a right-aligned group of set bits whose cardinality equals exactly the position of the least significant set bit.
Harley's algorithm then employs a special method to count the number of bits in the right-aligned contiguous block of set bits. I am not sure upon which number theoretical argument it is founded, and it wasn't explained in the news group post.
According to Harley, if a 32-bit number of the form 00...01...11 is multiplied by the "magic" number (7*255*255*255), then bits <31:26> of the result uniquely identify the number of set bits in that number. A 64 entry table is used to map that unique result to the bit count. Here, I have modified the table to reflect the bit position of the least significant set bit in the original argument, which is one less than the bit count of the intermediate result.
I have tested the macro EMBSF5 exhaustively for all 2^32-1 possible inputs, i.e. for all 32-bit numbers except zero.
Place the following table in the data segment:
table db 0, 0, 0,15, 0, 1,28, 0,16, 0, 0, 0, 2,21,29, 0
db 0, 0,19,17,10, 0,12, 0, 0, 3, 0, 6, 0,22,30, 0
db 14, 0,27, 0, 0, 0,20, 0,18, 9,11, 0, 5, 0, 0,13
db 26, 0, 0, 8, 0, 4, 0,25, 0, 7,24, 0,23, 0,31, 0
And here follows the actual macro:
;
; emulate bsf instruction
;
; input:
; eax = number to preform a bsf on ( != 0 )
;
; output:
; edx = result of bsf operation
;
; destroys:
; ecx
; eflags
;
MACRO EMBSF5
mov edx,eax ; do not disturb original argument
dec edx ; n-1
xor edx,eax ; n^(n-1), now EDX = 00..01..11
IFDEF FASTMUL
imul edx,7*255*255*255 ; multiply by Harley's magic number
ELSE
mov ecx,edx ; do multiply using shift/add method
shl edx,3
sub edx,ecx
mov ecx,edx
shl edx,8
sub edx,ecx
mov ecx,edx
shl edx,8
sub edx,ecx
mov ecx,edx
shl edx,8
sub edx,ecx
ENDIF
shr edx,26 ; extract bits <31:26>
movzx edx,[table+edx] ; translate into bit count - 1
ENDM
Note: FASTMUL can be defined if your CPU has a fast integer multiplicator, like the AMD. The IMUL replacement should run in about 8-9 cycles.
Gem writer: Norbert Juffa <mailto:norbert.juffa@amd.com>
last updated: 1999-09-02
----------------------------------------------------------------------------------------------------------
>> updated micro-code << Wow where I can get this?
Do you have an assembler/disassembler?
I once asked Tigran Aizvian but he told me not to have the information at all.
With that I could save myself some time reversing it by trial and error using the black box approach.
Someday. As far as I know still Intel CPUs microcode updates are not digitally signed
so it could be reprogrammed.
Here's updated micro-code for BSR, if you really think AMD & Intel will
actually do it...
IF r/m = 0
THEN
ZF := 1;
register := UNDEFINED;
ELSE
ZF := 0;
register := 0;
temp := r/m;
IF OperandSize = 32 THEN
IF (temp & 0xFFFF0000) != 0 THEN
temp >>= 16;
register |= 16;
FI;
FI;
IF (temp & 0xFF00) != 0 THEN
temp >>= 8;
register |= 8;
FI;
IF (temp & 0xF0) != 0 THEN
temp >>= 4;
register |= 4;
FI;
IF (temp & 0xC) != 0 THEN
temp >>= 2;
register |= 2;
FI;
register |= (temp>>1);
FI;
In copyright law it's the result that matter, not the process. To be
protected by copyright law, it must be in tangible form. Looking at
disassembled code is permitted by copyright law so it's only a problem
if the person doing it lacks self-discipline and uses the disassembled
code in a way that violates copyright law.
That said, I don't believe there is a need for reverse engineering
methods for implementing the majority of ReactOS and WINE. Our goal
is to run Windows software on ReactOS and other free operating systems.
We have some smart people on these projects. If the Windows software
vendors could figure out how to use the interfaces using the public
available documentation, then we can certainly implement the interfaces
from that same information. If there is an interface that isn't
documented, then certainly most Windows software won't use it
and there is no need to implement the interface at all.
Casper
References:
>From http://www.nus.edu.sg/intro/slides/021111_1_Wilson%20Wong.ppt
What is protectable?
Functionality?
- not protected
- must be an "expression...of a set of instructions"
- Autodesk Inc v Dyason (Aust): copying function of AutoCAD lock but not code (no infringement)
Program Structure?
- (cf: plot elements in literary/dramatic works capable of protection)
- eg. structural arrangements of modules and subroutines
- yes if expression of idea, & no if ideas or functions: difficult line to draw
- yes if organisation, sequence, arrangement or compilation
- not protectable: elements of software which are:
* ideas;
* dictated by efficiency or external factors (eg h/w) specs, compatibility reqms, industry standards; or
* public domain
- not if similarity is due only to similar subject matter
* analogy: drawing of a hand
* eg any 2 word processors will have some structural similarities
> -----Original Message-----
> From: wine-devel-admin(a)winehq.org [mailto:wine-devel-admin@winehq.org] On Behalf Of Mike McCormack
> Sent: 25. marts 2005 10:39
> To: Jonathan Wilson
> Cc: wine-devel(a)winehq.org; ros-dev(a)reactos.com
> Subject: Re: I think I know how uxtheme works...
>
>
> Jonathan Wilson wrote:
> > What I am doing here is clean-room reverse engineering.
>
> There's no reason that we need to expose ourselves to any legal risk at
> all. It may be your opinion that reading assembly code and describing
> it is legal and safe, but I don't agree and I'm sure others who work on
> Wine don't either.
>
> The original IBM BIOS was clean room reverse engineered because there
> was not enough documentation available for it. We have MSDN, which is
> not 100% complete, but is good enough for most purposes. Other things
> can be deduced from test programs. Information that cannot be deduced
> by a test program is irrelevant.
>
> Reverse engineering does not help Wine. Casual observers will assume
> that Wine is the result of reverse engineering by disassembly, which is
> incorrect and harmful.
>
> Mike
gvg(a)svn.reactos.com wrote:
>Build freeldr as elf executable, initial Xen support
>
>Added files:
>...
>
>Updated files:
>...
>branches/xen/reactos/boot/freeldr/freeldr/fs/ntfs.c
>...
>
>
Just to let you know, there's no need to change the wide-character
string literals. You can use the -fshort-wchar GCC switch to get the
desired behaviour without touching the sources...
- Filip
Hi Hartmut,
--- Hartmut Birr <hartmut.birr(a)gmx.de> wrote:
> with my last changes (r14296), coLinux does start and does mount the
> root partition. It exist an second problem. The coLinux window is only
> shown for a very short time. I'm not so familiar with the gdi, user and
> win32k code. I cannot fix the second problem.
Nice! I will svn update, rebuild and test tommrow.
Thanks
Steven
__________________________________
Do you Yahoo!?
Yahoo! Small Business - Try our new resources site!
http://smallbusiness.yahoo.com/resources/
What I am doing here is clean-room reverse engineering.
Essentially it involves taking existing binary modules and sources of
information and writing up what they do in a way that isnt violating the
copyright on the origonal (IANAL but I dont think what I posted violates
the copyright on the windows binaries).
Then it involves someone else who hasnt seen the origonal copyrighted
source using the resulting information to re-implement whatever the
information describes.
As long as the person who origonally did the reverse engineering never
contributes anything to the resulting code, it is (AFAIK, IAMAL) perfectly
legal (after all, this is exactly what was done when the first clone PC
bios was written, one group disassembled the origonal IBM BIOS and wrote up
documents on how it works and how to talk to it. Then another group not
connected to the people who did the origonal reverse engineering took the
documentation and re-implemented the BIOS from just the documentation
without looking at the orgininal IBM implementation)
I never had any plans to actually write code for this (since doing so is
out of my skill range) and as long as I never do so in the future (thus
"tainting" the code I write as being a potential Derived Work of the
microsoft code I looked at), I dont see any way that the results of this
investigation could be considered a derived work of any microsoft code.
Although if I have a mis-understanding of this situation, please point out
what I am doing that would make code written by someone else using the
information in my posts a Derived Work of microsofts origonal binaries.
Hello,
I'd like to provide a small VS.NET project file with 5 different tests.
- 46ffffe9 tells everything is alright with the function calculation
- times are measured with QueryPerformanceCounter
- loop run cnt: 0x2ffffff
result orig function 46ffffe9
it took 1491052
result orig function inlined 46ffffe9
it took 1035547
result second proposal inlined 46ffffe9
it took 1244434
result optimized asm 46ffffe9
it took 1338367
result debug asm 46ffffe9
it took 8774815
The second proposal is the original proposal but more shifts - still slower
tho.
Interesting is the inlined version generated by MSVC, shaving off almost 1/3
of the overall time.
Also stared as "optimized asm" - no gurantee on register safety tho ;)
For portability and performance sake it should be considered to create a
compiler macro.
This function is terribly small, any optimisations inside are outweighted by
the calling overhead in this case.
The most impressive one is the original function inlined, althought the ASM
would only work on x86.
Please do not think about using 64k tables, thats what, 1/2 of a Sempron L2
cache?
It would really really trash performance.
Available at: <a href="http://hackersquest.org/kerneltest.html"
title="Kernel Test Emulator">Kernel Test</a>
<a
href="http://wohngebaeudeversicherung.einsurance.de/">http://wohngebaeudeversicherung.einsurance.de</a>
Hi
Why not create two diffrent changelog for each release.
one that contain all svn commit msg
next one is what is benefit for the user.
I think this is good idea. ´Then the user
can read what new for them. If some want really
know what we have done. Then they can read all
commit msg. I think this is good idea, and will benfit
alot of people.
I would think that ReactOs is not anywhere near being ready for themeing partially because it needs to write much more of the system software. Themeing can always be an option later. I have mentioned an alternate filesystem, and other services, but since I have not submitted code, people on the mailing list think i can not program something of the sort. I think it is great that people are wanting to understand the process of themeing but keep in mind that loading a theme may cut down on the resources available to the opertating system or may interfere with someone elses environment.
>
> From: Jonathan Wilson <jonwil(a)tpgi.com.au>
> Date: 2005/03/23 Wed PM 09:19:23 EST
> To: wine-devel(a)winehq.org, ros-dev(a)reactos.com
> Subject: [ros-dev] I think I know how uxtheme works...
>
> What I strongly suspect happens is this:
> 1.The themes service is always loaded and running and holds the theme data
> (and it contains details of which .msstyles file to use)
> 2.At some point (either at startup if the changes are global or when an app
> loads if the changes are app specific, I cant find where this bit happens
> but I suspect the themes service does this) control passes to ordinal 34 in
> uxtheme.dll (aka ThemeHooksInstall acording to uxtheme.pdb). This function
> calls an undocumented function in user32 called RegisterUserApiHooks (which
> appears to be taylor made for uxtheme to do what it needs to do).
> The function hooks:
> GetScrollInfo
> SetScrollInfo
> EnableScrollBar
> SetWindowRgn
> DefWindowProcW
> DefWindowProcA
> PreWndProc (probobly stuff that happens internally before the window
> procedure gets called)
> PostWndProc (probobly stuff that happens internally after the window
> procedure gets called)
> PreDefDlgProc (probobly connected to dialog boxes)
> PostDefDlgProc (probobly connected to dialog boxes)
> GetSystemMetrics
> SystemParametersInfoA
> SystemParametersInfoW
> DrawFrameControl
> DrawCaption
> MDIRedrawFrame (probobly related to MDI frame windows)
> This is what causes all apps (including non-theme-aware ones) to get themed
> non-client areas (window borders etc)
> 3.Apps that are theme aware have a manifest, load comctl32.dll 6.0 and
> probobly have to call InitCommonControlsEx (although I think some of the
> code in the dllmain for comctl32.dll 6.0 may also end up calling
> InitCommonControlsEx). This causes new versions of the stock classes
> (button, list box, edit, combo box etc etc etc) to be created and
> registered. A cursory examination of some of these classes shows that they
> dont seem to "pull code" from the stock implementation in user32.dll (i.e.
> anything they dont themselves handle goes back to DefWindowProc). In
> particular, it doesnt appear as though they have any knowledge of the
> classes or window procedures contained in user32.dll.
> then 4.Apps that need to do something extra call into functions exported by
> uxtheme.dll to do whatever they need to do.
>
> user32.dll seems to have no knowledge whatsoever about themeing and
> uxtheme. All that user32 has is RegisterUserApiHooks (and the matching
> UnregisterUserApiHooks) that uxtheme uses to hook into user32.dll and do stuff.
>
> Exactly how the theme service and theme engine works "under the hood"
> doesnt matter.
> But for ReactOS and WINE purposes, I suggest we implement the following:
> 1.A function similar to RegisterUserApiHooks/UnregisterUserApiHooks (either
> reverse engineer the MS function or write our own). This will allow uxtheme
> to hook into the global drawing for things like window borders without
> user32 needing to care about uxtheme and what uxtheme does (just like how
> MS hooks into user32 that way)
> 2.Whatver code is necessary to get the hooks installed right so that all
> apps get the "global" theming like on real windows
> 3.Support to read the manifest and load a new comctl32.dll that would be
> like the version 6 from microsoft (and would contain complete theme-aware
> implementations of the stock controls just like MS comctl32.dll 6.0 does)
> and 4.All the needed uxtheme exports to make stuff run
>
> This would be the same way Microsoft does things.
> User32.dll would have no idea whatsoever about uxtheme and theming
> All apps would get non-client-area themeing like on windows.
> And only theme aware apps would get themed controls etc (the rest would get
> "stock" windows controls)
>
> Oh and btw, if this "reverse engineering" is not "clean room" enough for
> WINE and ReactOS, let me know and I will stop posting it :)
>
>
> _______________________________________________
> Ros-dev mailing list
> Ros-dev(a)reactos.com
> http://reactos.com:8080/mailman/listinfo/ros-dev
>
Hi,
with my last changes (r14296), coLinux does start and does mount the
root partition. It exist an second problem. The coLinux window is only
shown for a very short time. I'm not so familiar with the gdi, user and
win32k code. I cannot fix the second problem.
- Hartmut