I enjoy Win32 API programming. So far I have worked on it without any formal learning, and I think it is high time to make my understanding of Win32 neat and clean. This is my attempt to summarize the essentials of Win32 programming, and I am going to post the basics in this blog. It is not going to be a complete Win32 course, but it should help me and others brush up on what we already know. I am not going to cover the deeper topics, since I don’t know much about them. Charles Petzold’s “Programming Windows” is a good book to start with. I recommend it to anyone who is starting to learn Win32 programming.
Thanks to my friend Mr. Suresh, who has been conducting Win32 classes for freshers in our office. I attended a session and found it useful for me as well. Suresh is strong in Win32, and he will help me make this series useful to everyone. I welcome readers to point out any mistakes in the postings!
After all sharing knowledge is nothing but learning!
16/32 bit programming
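Before getting into programming, let us see what 16 bit and 32 bit are all about. The 16 or 32 bits we talk about refer to the processor’s native word size, that is, the width of its general-purpose registers and of the data it handles in one operation. The external data bus is often the same width, but not always: the 80386SX is a 32 bit processor with a 16 bit data bus. If a microprocessor works on 32 bits (4 bytes) of data at a time, it is called a 32 bit microprocessor.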
Intel followed a 16 bit architecture up to the 80286 processor. The memory model for this architecture is called the segmented memory model. In this model, each memory address is specified as a segment and an offset. It was a bit messy for programmers to handle this segment – offset concept. Here the int data type in C is 16 bits in size (2 bytes). Windows 1.0 to 3.1 followed this, which resulted in the Win16 API.
From the 80386 processor onwards, Intel switched over to a 32 bit architecture. Win32 programs on it use a linear (flat) memory model. No segment – no offset. The C int data type became 32 bits (4 bytes) and pointers are all 32 bits. Neat and clean. Starting with Windows NT 3.1 and Windows 95, Windows became a 32 bit operating system that runs on the 80386 and higher processors. This resulted in the Win32 API.
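To get a feel for why segment – offset addressing was messy, here is a small sketch of the 8086 real-mode rule, where the linear address is the segment shifted left by 4 bits plus the offset. The function name linear_address is my own, and the sketch assumes real mode only (in 80286 protected mode the segment value is a selector into a descriptor table instead).

```c
#include <stdint.h>

/* Hypothetical helper: 8086 real-mode address translation.
   linear = segment * 16 + offset, giving a 20-bit address. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

Note that different segment – offset pairs can name the same byte: 1234:0000 and 1000:2340 both give 0x12340, which is one of the things that made this model confusing.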
What is Windows API?
|Windows Kernel and Device drivers|
An Application Programming Interface (API) is the library offered by the operating system to developers for building applications that run on that OS. Through the API, a user mode program can access the services offered by the OS. Traditional UNIX programmers access OS services through a layer called the System Call Interface. Then why would we want an operating-system-level API in Windows? Unlike UNIX, Windows wraps many system calls into a single API call to make the programmer’s job easier (so that they can spend more time forwarding emails and reading a round-robin mail for the 10th time in a year). Apart from this, Windows has its Graphical User Interface (GUI) built into the operating system. The graphical screen elements (windows, buttons, menus, etc.) are made simple through the API. Programmers therefore need not worry about fine details that are unrelated to the end user’s requirements. It is simpler to call a function like CreateWindow() than to issue many open, read and write system calls to create a window on the screen. Since the API is implemented by the operating system, it knows the best way of calling the underlying system calls.
Thank God, somehow I have managed to justify my programming work on the Windows API. 🙂
Concepts of Windows
Conventional console-based programs (DOS and UNIX) call a function such as read(), usually through a library, to read data from the keyboard. For example, the keyboard device driver stores the actual data (the key pressed) in a system buffer. A library (stdio) collects this data and passes it to the program when the program calls read(). That is, a program reads the keyboard buffer only when it needs input. The OS does not send the program any message to say that a character has arrived from the keyboard. The following flow explains this in simple form.
Keyboard driver : Keyboard ----> A key is pressed ----> data is stored in system buffer
Application Program : read() ----> if data is in keyboard buffer ----> data is read by application
In Windows, this scenario is different. The OS makes calls to the application program. Let me put it in detail. For example, whenever a key is pressed on the keyboard, Windows (the OS) stores that event (key pressed or released) and its data in the system message queue. The OS examines the message and identifies the window it belongs to. From this information, it can identify the thread that created this window. The OS then passes the event on to that thread’s message queue. The thread runs a loop that processes the incoming messages. That is, the OS conveys the message to the application. The following flow explains this in simple form. Note: An application thread may have more than one window. So a window does not mean an application; it means one of the windows of that application.
Keyboard driver : Keyboard ----> A key is pressed ----> this event and its data are stored in the system message queue (OS level)
Windows (OS) : Identifies the window of the message ----> Identifies the corresponding thread ----> Posts the message to that thread’s message queue (application level).
Application (thread) : Gets the message ----> Translates the message (converting virtual key codes into actual characters) ----> Character message is posted back to the thread’s message queue ----> Dispatches the message to the corresponding window procedure by calling that procedure.
Window Procedure : Identifies the message type and acts on it ----> Completes and returns to the application thread, which retrieves the next message.
DispatchMessage() is a function implemented by the OS and called by the application thread, and it in turn calls the window procedure. So the window procedure of an application is called by the OS! This is quite strange compared with conventional programming.
Types of Keystroke messages
| Keystroke type | Key pressed | Key released |
| Nonsystem keystroke (handled by the window procedure) | WM_KEYDOWN | WM_KEYUP |
| System keystroke (combination with the Alt key, handled by DefWindowProc) | WM_SYSKEYDOWN | WM_SYSKEYUP |
Scan Code: The physical keyboard sends this code based on the key pressed. Hence it is highly device dependent.
Virtual Key Code: Windows has defined these codes for all keys. They are device independent. For digits and uppercase letters the virtual key codes are the same as the ASCII codes. The rest of the keys are defined by Windows with their own key codes (the VK_ constants).
The wParam of a keystroke message contains the virtual key code of the key pressed. The lParam contains the following information:
1. Repeat Count (bits 0-15) – Number of keystrokes represented by the message.
2. OEM Scan Code (bits 16-23) – The hardware scan code; Windows applications can generally ignore it.
3. Extended Key Flag (bit 24) – Set if the key is an extended key, such as the right-hand Alt and Ctrl keys.
4. Context Code (bit 29) – Set if the Alt key is held down.
5. Previous Key State (bit 30) – Whether the key was previously up (0) or down (1).
6. Transition State (bit 31) – Whether the key is being pressed (0) or released (1).
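The six fields above can be pulled out of lParam with shifts and masks. The sketch below is plain C with no Windows headers so that it compiles anywhere; the KeyInfo structure and decode_key_lparam are hypothetical names of mine, not part of the Windows API. It assumes the documented bit layout (only the low 32 bits of lParam carry this information).

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields (names are my own). */
typedef struct {
    unsigned repeat_count;   /* bits 0-15  */
    unsigned scan_code;      /* bits 16-23 */
    unsigned extended;       /* bit 24     */
    unsigned context_code;   /* bit 29: Alt held down */
    unsigned previous_down;  /* bit 30: key was down before this message */
    unsigned transition;     /* bit 31: 0 = being pressed, 1 = being released */
} KeyInfo;

/* Splits the low 32 bits of a keystroke message's lParam into its fields. */
KeyInfo decode_key_lparam(uint32_t lparam)
{
    KeyInfo k;
    k.repeat_count  =  lparam        & 0xFFFFu;
    k.scan_code     = (lparam >> 16) & 0xFFu;
    k.extended      = (lparam >> 24) & 1u;
    k.context_code  = (lparam >> 29) & 1u;
    k.previous_down = (lparam >> 30) & 1u;
    k.transition    = (lparam >> 31) & 1u;
    return k;
}
```

In a window procedure this would be called as decode_key_lparam((uint32_t)lParam) while handling WM_KEYDOWN or WM_KEYUP. For a WM_KEYUP message both previous_down and transition come out as 1.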
GetKeyState() : This API lets us find out the state of the Shift, Alt, Ctrl, Caps Lock, Num Lock and Scroll Lock keys.
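| Types of character messages | Characters | Dead characters |
| Nonsystem characters | WM_CHAR | WM_DEADCHAR |
| System characters | WM_SYSCHAR | WM_SYSDEADCHAR |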
The wParam of a character message contains the ANSI or Unicode character code of the key pressed. The lParam contains the same information as in keystroke messages. IsWindowUnicode() is the API used to determine whether a window is a Unicode window or not.
Message ordering example: Typing uppercase A by using Shift key
| Message | Key or Code |
| WM_KEYDOWN | Virtual key code VK_SHIFT (0x10) |
| WM_KEYDOWN | Virtual key code for 'A' (0x41) |
| WM_CHAR | Character code for 'A' (0x41) |
| WM_KEYUP | Virtual key code for 'A' (0x41) |
| WM_KEYUP | Virtual key code VK_SHIFT (0x10) |
Dead character messages: Some non-US keyboards (German, for example) have diacritic keys. A diacritic key does not generate a character by itself. It is used in combination with a letter key to make an accented character. The message generated for such a key is called a dead character message.
GetSystemMetrics(SM_MOUSEPRESENT) is used to determine if a mouse is present or not.
Unlike keyboard messages, mouse messages reach a window procedure even if its window is inactive or does not have the input focus, as long as the mouse is over the window. The CS_DBLCLKS flag should be set in the window class to enable the window to receive mouse double click messages.
Capturing the mouse outside the window: SetCapture (hwnd) ;
To release the captured mouse: ReleaseCapture () ;
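Client Area Messages usually handled by the window proc:
| Button | Pressed | Released | Pressed (Second Click) |
| Left | WM_LBUTTONDOWN | WM_LBUTTONUP | WM_LBUTTONDBLCLK |
| Middle | WM_MBUTTONDOWN | WM_MBUTTONUP | WM_MBUTTONDBLCLK |
| Right | WM_RBUTTONDOWN | WM_RBUTTONUP | WM_RBUTTONDBLCLK |
Non-Client Area (menu, title bar, scroll bar) Messages usually handled by the DefWindowProc:
| Button | Pressed | Released | Pressed (Second Click) |
| Left | WM_NCLBUTTONDOWN | WM_NCLBUTTONUP | WM_NCLBUTTONDBLCLK |
| Middle | WM_NCMBUTTONDOWN | WM_NCMBUTTONUP | WM_NCMBUTTONDBLCLK |
| Right | WM_NCRBUTTONDOWN | WM_NCRBUTTONUP | WM_NCRBUTTONDBLCLK |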
The Microsoft IntelliMouse has a little wheel between its two buttons. Rotating the wheel generates the WM_MOUSEWHEEL message.
SetTimer(): Sets the time-out interval in milliseconds, in theory anywhere from 1 msec to about 50 days. At every time-out, the window proc receives a WM_TIMER message.
KillTimer(): Stops the timer. Microsoft’s documentation notes that it does not remove WM_TIMER messages already posted to the message queue, so the program should tolerate one arriving after the call.
Timer messages are not asynchronous. The WM_TIMER messages are queued in the normal message queue. The window proc may miss a WM_TIMER message when it is busy. The interval set through SetTimer() is not accurate, since the timer resolution depends on the hardware.
There are two major methods to implement a timer.
1. Call SetTimer() to set the timer interval and process WM_TIMER in the window proc.
2. Call SetTimer() to set the timer interval and attach a callback function to the timer. The WM_TIMER messages are then directed to this callback function instead of the window proc.
There are two types of dialog boxes:
1. Modal dialog
2. Modeless dialog
When a program displays a modal dialog box, the user cannot switch between the dialog box and another window in your program. The user must explicitly end the dialog box. The user can, however, switch to another program while the dialog box is still displayed. Some dialog boxes (called “system modal”) do not allow even this. System modal dialog boxes must be ended before the user can do anything else in Windows.
Modeless dialog boxes allow the user to switch between the dialog box and the window that created it as well as between the dialog box and other programs.
Windows provides some common dialog boxes so that the user interface is uniform across applications. They are listed below:
GetOpenFileName() – Displays an Open File dialog box
GetSaveFileName() – Displays a Save File dialog box
ChooseFont() – Displays a Font selection dialog box
FindText() and ReplaceText() – Displays a string finding and replacing modeless dialog box
Each process is started with a single thread, often called the primary thread, but can create additional threads from any of its threads.
A thread is the entity within a process that can be scheduled for execution. All threads of a process share its virtual address space and system resources. In addition, each thread maintains exception handlers, a scheduling priority, thread local storage, a unique thread identifier, and a set of structures the system will use to save the thread context until it is scheduled. The thread context includes the thread’s set of machine registers, the kernel stack, a thread environment block, and a user stack in the address space of the thread’s process. Threads can also have their own security context, which can be used for impersonating clients.
Windows supports preemptive multitasking, which creates the effect of simultaneous execution of multiple threads from multiple processes. On a multiprocessor computer, the system can simultaneously execute as many threads as there are processors on the computer.
CreateProcess() : Creates a process with the given attributes. GetPriorityClass() and SetPriorityClass() can be used to get and set the priority class of the created process. The priority class can be one of the following: IDLE_PRIORITY_CLASS, BELOW_NORMAL_PRIORITY_CLASS, NORMAL_PRIORITY_CLASS (default), ABOVE_NORMAL_PRIORITY_CLASS, HIGH_PRIORITY_CLASS and REALTIME_PRIORITY_CLASS.
The following are the priority levels within each priority class. GetThreadPriority() and SetThreadPriority() are the related APIs. The levels are THREAD_PRIORITY_IDLE, LOWEST, BELOW_NORMAL, NORMAL (default), ABOVE_NORMAL, HIGHEST and TIME_CRITICAL.
By default, a child process inherits a copy of the environment block of the parent process.
CreateThread() : Creates a thread with the given attributes. Priority levels can be set as discussed above. By default, threads are created in the executable state. If we use the CREATE_SUSPENDED flag while creating a thread, the thread will not be executed immediately. We need to call ResumeThread() to bring it to the executable state. ExitThread() terminates the calling thread in an orderly way. GetExitCodeThread() returns the termination status of a thread.
The SwitchToThread() function causes the calling thread to yield execution to another thread that is ready to run on the current processor. The operating system selects the thread to yield to. This can be implemented with Sleep() also. Sleep(0) causes the thread to relinquish the remainder of its time slice to any other thread of equal priority that is ready to run. If there are no other threads of equal priority ready to run, the function returns immediately, and the thread continues execution.
TerminateThread() can result in the following problems (ExitThread() is preferred):
If the target thread owns a critical section, the critical section will not be released.
If the target thread is allocating memory from the heap, the heap lock will not be released.
If the target thread is executing certain kernel32 calls when it is terminated, the kernel32 state for the thread’s process could be inconsistent.
If the target thread is manipulating the global state of a shared DLL, the state of the DLL could be destroyed, affecting other users of the DLL.
Thread Local Storage (TLS): Thread local storage (TLS) enables multiple threads of the same process to use an index allocated by the TlsAlloc function to store and retrieve a value that is local to the thread. In a typical use, an index is allocated when the process starts. When each thread starts, it allocates a block of dynamic memory and stores a pointer to this memory in the TLS slot using the TlsSetValue function. Any function the thread calls can then use the TlsGetValue function to access the data associated with the index that is local to the calling thread. Before each thread terminates, it releases its dynamic memory. Before the process terminates, it calls TlsFree to release the index. This is quite an interesting concept and worth describing in more detail. Here’s how the APIs work:
First define a structure that contains all the data that needs to be unique among the threads. For example,
typedef struct
{
int a ;
int b ;
}
DATA, * PDATA ;
The primary thread calls TlsAlloc to obtain an index value:
dwTlsIndex = TlsAlloc () ;
This index value can be stored in a global variable or passed to the Thread function in the argument structure.
The Thread function begins by allocating memory for the data structure and calling TlsSetValue using the index obtained above:
TlsSetValue (dwTlsIndex, GlobalAlloc (GPTR, sizeof (DATA))) ;
This associates a pointer with a particular thread and a particular thread index. Now any function that needs to use this pointer, including the original Thread function itself, can include code like so:
PDATA pdata ;
pdata = (PDATA) TlsGetValue (dwTlsIndex) ;
Now it can set or use pdata->a and pdata->b. Before the Thread function terminates, it frees the allocated memory:
GlobalFree (TlsGetValue (dwTlsIndex)) ;
When all the threads using this data have terminated, the primary thread frees the index:
TlsFree (dwTlsIndex) ;
The threads share several variables or a data structure. Often, these multiple variables or the fields of the structure must be consistent among themselves. The operating system could interrupt a thread in the middle of updating these variables. The thread that uses these variables would then be dealing with inconsistent data. The result is a collision, and it’s not difficult to imagine how an error like this could crash the program. What we need are the programming equivalents of traffic lights to help coordinate and synchronize the thread traffic. That’s the critical section. Basically, a critical section is a block of code that should not be interrupted.
Declaring: CRITICAL_SECTION cs ;
Initializing: InitializeCriticalSection (&cs) ;
Entering: EnterCriticalSection (&cs) ;
Leaving: LeaveCriticalSection (&cs) ;
Deleting: DeleteCriticalSection (&cs) ;
The most common use of multiple threads of execution is for programs that find they must carry out some lengthy processing. We can call this a “big job,” which is anything a program has to do that might violate the 1/10-second rule. Obvious big jobs include a spelling check in a word processing program, a file sort or indexing in a database program, a spreadsheet recalculation, printing, and even complex drawing. Of course, as we know by now, the best solution to following the 1/10-second rule is to farm out big jobs to secondary threads of execution. These secondary threads do not create windows, and hence they are not bound by the 1/10-second rule.
Create: hEvent = CreateEvent (&sa, fManual, fInitial, pszName) ;
Set the Event state as signalled: SetEvent (hEvent) ;
Set the Event state as unsignalled: ResetEvent (hEvent) ;
Waiting for the signalled state: WaitForSingleObject (hEvent, dwTimeOut) ;
Waiting for multiple objects to become signalled: WaitForMultipleObjects (…) ;