Maximize the output of your CPU (Keep your CPU in full power mode)

Recently I have been working on UI decoding optimization. I found a program called Full Throttle Override very useful: it keeps your CPU running at full power.

To balance power consumption against performance, almost all x86 CPUs support Cool’n’Quiet, SpeedStep, or PowerNow! technology, which dynamically adjusts the CPU frequency based on load.

I found it’s pretty easy to implement Full Throttle Override, and here is the core C++ code:

		
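#include <windows.h>
#include <powrprof.h>   // link with PowrProf.lib

void FullThrottle()
{
    OSVERSIONINFO osvi;
    memset(&osvi, 0, sizeof(OSVERSIONINFO));
    osvi.dwOSVersionInfoSize = sizeof(OSVERSIONINFO);
    GetVersionEx(&osvi);
    // The power scheme APIs used below exist on Vista (major version 6) and above
    if (osvi.dwMajorVersion >= 6)
    {
        GUID *scheme = NULL;
        if (PowerGetActiveScheme(NULL, &scheme) != ERROR_SUCCESS)
            return;
        // Raise the minimum processor state to 100% (this index applies on AC power)
        PowerWriteACValueIndex(NULL
            , scheme
            , &GUID_PROCESSOR_SETTINGS_SUBGROUP
            , &GUID_PROCESSOR_THROTTLE_MINIMUM
            , 100);
        // Re-apply the scheme so the new value takes effect
        PowerSetActiveScheme(NULL, scheme);
        // PowerGetActiveScheme allocates the GUID; the caller must free it
        LocalFree(scheme);
    }
    else
    {
        MessageBox(NULL, L"Not supported by your OS!", L"", 0);
    }
}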

Feel the power of parallel computing (OpenMP)

For the past two weeks I have been working on the UI side of our product to improve animation rendering performance. Previously, a single thread decoded the animation line by line, and a whole frame took around 50 ms.

Now I have changed the rendering so that all lines are decoded in parallel, taking full advantage of modern multi-core CPUs.
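The idea can be sketched roughly as follows. This is only an illustration, not the product code: decode_line and decode_frame are hypothetical names, and the "decoding" here simply inverts each byte as a stand-in for the real per-line work. The point is that each line writes only to its own row, so an OpenMP parallel for can split the rows across cores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-line decoder: stands in for the real animation codec.
// Here it just inverts every byte of the line.
static void decode_line(const std::uint8_t* src, std::uint8_t* dst, int width)
{
    for (int x = 0; x < width; ++x)
        dst[x] = static_cast<std::uint8_t>(255 - src[x]);
}

// Decode every line of a frame. Each iteration touches only its own row,
// so rows can be handed to different threads without locking.
// If OpenMP is not enabled, the pragma is ignored and the loop runs serially.
void decode_frame(const std::vector<std::uint8_t>& src,
                  std::vector<std::uint8_t>& dst,
                  int width, int height)
{
    assert(src.size() == static_cast<std::size_t>(width) * height);
    dst.resize(src.size());
    #pragma omp parallel for
    for (int y = 0; y < height; ++y)
    {
        const std::size_t offset = static_cast<std::size_t>(y) * width;
        decode_line(src.data() + offset, dst.data() + offset, width);
    }
}
```

Calling decode_frame(src, dst, width, height) gives the same output whether or not OpenMP is switched on, which makes it easy to compare the two builds.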

Visual Studio supports OpenMP natively, which gives me an easy way to use this powerful tool.

To set this compiler option in the Visual Studio development environment:

  1. Open the project’s Property Pages dialog box. For details, see How to: Open Project Property Pages.
  2. Expand the Configuration Properties node.
  3. Expand the C/C++ node.
  4. Select the Language property page.
  5. Modify the OpenMP Support property.
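One way to confirm that the setting took effect is to look at the _OPENMP macro, which a compiler defines only when OpenMP support is on. The helpers below are a small sketch with made-up names (openmp_version, report_openmp), not part of any library.

```cpp
#include <cstdio>

// Returns the value of the _OPENMP macro (a yyyymm date) when the compiler
// has OpenMP enabled, or 0 when the option is off.
int openmp_version()
{
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;
#endif
}

// Prints a one-line report; call it once at startup while testing the setting.
void report_openmp()
{
    if (openmp_version() != 0)
        std::printf("OpenMP enabled, version macro %d\n", openmp_version());
    else
        std::printf("OpenMP is not enabled\n");
}
```

If report_openmp() says OpenMP is not enabled, any #pragma omp lines in the project are being ignored and the loops still run on a single thread.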

After a few simple code changes, I was surprised to find that frame decoding became about 9.5 times faster (an 850% increase), going from 8 FPS to 76 FPS!

 

Let’s do a simple test with the following code:

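#include <windows.h>   // QueryPerformanceCounter, LARGE_INTEGER
#include <tchar.h>     // _tmain, _TCHAR
#include <stdio.h>     // printf
#include <stdlib.h>    // rand
#include <conio.h>     // _getch

#define TEST_LENGTH 0x3fffffff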

// Fill a buffer with rand() using an OpenMP parallel loop; returns elapsed ms.
double mptest()
{
    LARGE_INTEGER counter;
    double frequency;
    __int64 c1, c2;
    QueryPerformanceFrequency(&counter);
    frequency = (double)counter.QuadPart;   // ticks per second
    unsigned char *test = new unsigned char[TEST_LENGTH];
    QueryPerformanceCounter(&counter);
    c1 = counter.QuadPart;
    // Iterations are independent, so OpenMP can split them across cores
    #pragma omp parallel for
    for (int i = 0; i < TEST_LENGTH; i++)
    {
        test[i] = (unsigned char)rand();
    }
    QueryPerformanceCounter(&counter);
    c2 = counter.QuadPart;
    delete[] test;   // memory from new[] must be released with delete[]
    return (c2 - c1) * 1000.0 / frequency;
}

// Same work as mptest() on a single thread; returns elapsed ms.
double test()
{
    LARGE_INTEGER counter;
    double frequency;
    __int64 c1, c2;
    QueryPerformanceFrequency(&counter);
    frequency = (double)counter.QuadPart;   // ticks per second
    unsigned char *test = new unsigned char[TEST_LENGTH];
    QueryPerformanceCounter(&counter);
    c1 = counter.QuadPart;
    for (int i = 0; i < TEST_LENGTH; i++)
    {
        test[i] = (unsigned char)rand();
    }
    QueryPerformanceCounter(&counter);
    c2 = counter.QuadPart;
    delete[] test;   // memory from new[] must be released with delete[]
    return (c2 - c1) * 1000.0 / frequency;
}

int _tmain(int argc, _TCHAR* argv[])
{
    printf("Random generation cost with MP %lf ms\n", mptest());
    printf("Random generation cost without MP %lf ms\n", test());
    _getch();   // wait for a key press before the console closes
    return 0;
}
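The test above relies on Windows timers and on rand(), which the C standard does not promise is safe to call from several threads at once. As a rough alternative, here is a portable sketch of the same experiment that gives every thread its own generator and times the loop with std::chrono. The names fill_random_ms and thread_index, and the seed value, are made up for this example.

```cpp
#include <cassert>
#include <chrono>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Index of the calling thread inside a parallel region (0 without OpenMP).
static int thread_index()
{
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

// Fills `buffer` with pseudo-random bytes and returns the elapsed time in
// milliseconds. Each thread owns its own generator, so no state is shared.
double fill_random_ms(std::vector<std::uint8_t>& buffer, bool parallel)
{
    assert(buffer.size() <= static_cast<std::size_t>(INT_MAX));
    const int n = static_cast<int>(buffer.size());
    const auto start = std::chrono::steady_clock::now();
    #pragma omp parallel if(parallel)
    {
        // Arbitrary seed, offset per thread so the streams differ
        std::minstd_rand rng(1234u + static_cast<unsigned>(thread_index()));
        #pragma omp for
        for (int i = 0; i < n; ++i)
            buffer[i] = static_cast<std::uint8_t>(rng());
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling fill_random_ms(buffer, true) and then fill_random_ms(buffer, false) on the same buffer gives the two timings to compare. The numbers will depend on the machine and on whether OpenMP is enabled in the build.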

Look at the huge difference!

[Screenshot: benchmark output]

Microsoft .NET Framework 4.x Redistributable Installer Link

Sometimes it’s hard to find the installer links for Microsoft .NET Framework 4.x.

Here is my collection of links:

.NET Framework version   Redistributable installation
4.6 Preview              Download page for 4.6 web installer
                         Download page for 4.6 offline installer
4.5.2                    Download page for 4.5.2 web installer
                         Download page for 4.5.2 offline installer
4.5.1                    Download page for 4.5.1 web installer
                         Download page for 4.5.1 offline installer
4.5                      Download page for 4.5 web installer
                         Download page for 4.5 offline installer
4                        Download page for 4 web installer
                         Download page for 4 offline installer
4 Client Profile         Download page for 4 Client Profile web installer
                         Download page for 4 Client Profile offline installer

Download a hotfix without contacting Microsoft?

I have been working on Windows 7 Embedded recently and needed a hotfix that Microsoft has not released publicly. I found this article very useful:

How can you download a hotfix without contacting Microsoft?

Here is the trick:

A customer can get the fix they want without calling in to Microsoft, assuming they know the KB number of the hotfix they want and can remember the URL format for a self-service hotfix request:

http://support.microsoft.com/hotfix/KBHotfix.aspx?kbnum=KBNumber&kbln=KBLanguage
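If you request hotfixes often, filling in the pattern by hand gets tedious. The helper below is a small sketch that substitutes the two parameters. hotfix_request_url is a made-up name, and the KB number and the "en-us" language code in the usage note are placeholders I assume fit the pattern, not values from the article.

```cpp
#include <cassert>
#include <string>

// Builds the self-service hotfix request URL from the pattern quoted above.
// kb_number replaces KBNumber and language replaces KBLanguage.
std::string hotfix_request_url(const std::string& kb_number,
                               const std::string& language)
{
    assert(!kb_number.empty() && !language.empty());
    return "http://support.microsoft.com/hotfix/KBHotfix.aspx?kbnum=" + kb_number
         + "&kbln=" + language;
}
```

For example, hotfix_request_url("123456", "en-us") produces a URL ending in kbnum=123456&kbln=en-us.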

 

My first x64 assembly code cooked by hand

This is my first handmade x64 assembly code.

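extrn MessageBoxA:proc

.DATA
CONST SEGMENT
    msg DB "Hello World!", 0
CONST ENDS

.CODE
main PROC
    sub rsp, 28h        ; 32 bytes of shadow space + 8 to keep the stack 16-byte aligned
    xor rcx, rcx        ; hWnd = NULL
    lea rdx, msg        ; lpText
    lea r8, msg         ; lpCaption (same string as the text)
    xor r9, r9          ; uType = MB_OK
    call MessageBoxA
    xor eax, eax        ; return 0
    add rsp, 28h        ; release the stack space reserved above
    ret                 ; without this, execution runs past the end of main
main ENDP
END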

To build this code, run the following at a command prompt:

ml64 helloworld.asm /link /subsystem:windows /defaultlib:user32.lib /entry:main

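The major difference between the x86 and x64 calling conventions is that x64 passes the first few arguments in registers instead of on the stack. On Windows, which is the convention the code above uses, the first four integer or pointer arguments are passed in RCX, RDX, R8, and R9, while XMM0, XMM1, XMM2, and XMM3 are used for floating-point arguments. The caller also reserves 32 bytes of shadow space on the stack, which is why main starts with sub rsp, 28h (32 bytes plus 8 for alignment). Additional arguments are passed on the stack, and the return value is stored in RAX. The System V AMD64 convention used on Linux and macOS is different: there the first six integer or pointer arguments are passed in RDI, RSI, RDX, RCX, R8, and R9, and XMM0 through XMM7 are used for floating-point arguments.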
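To see the register assignment from the C++ side, a function with six integer arguments is a handy test case. sum6 is a made-up example, and the comments describe what the two common x64 conventions specify for a call that is not inlined. Compiling it and reading the generated assembly is the way to confirm what your compiler actually does.

```cpp
#include <cstdint>

// Six integer arguments, more than the Microsoft x64 convention keeps in registers.
// Microsoft x64: a, b, c, d arrive in RCX, RDX, R8, R9; e and f are read
// from the caller's stack.
// System V AMD64: all six arrive in RDI, RSI, RDX, RCX, R8, R9.
// In both conventions the 64-bit result is returned in RAX.
std::int64_t sum6(std::int64_t a, std::int64_t b, std::int64_t c,
                  std::int64_t d, std::int64_t e, std::int64_t f)
{
    return a + b + c + d + e + f;
}
```

A call such as sum6(1, 2, 3, 4, 5, 6) returns 21 either way. Only the location of the arguments differs between the conventions.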