PUBLIC Page 1033:3/3 | chip 2018-04-15 13:20:16
Tags: KernelMode

August 22 2017

I have grown to really hate UNICODE_STRING. It is the source of endless problems because it does some very bad things:

  • The text terminator (null) is not required for the text in Buffer[]. The text is sometimes null terminated.
  • The Length and MaximumLength counts are in bytes while the Buffer[] array is in characters.
  • Length never counts a terminator, and MaximumLength is not guaranteed to include room for one.

This means the text in Buffer[] cannot be used in any C-string functions without first allocating a new buffer to accommodate an extra terminating character, then copying the text just to add the terminator.
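
The copy step described above can be sketched roughly as follows. This is a user-mode illustration only: UnicodeStringLike is a stand-in that mirrors the fields of UNICODE_STRING so the code builds without the WDK, char16_t stands in for WCHAR, and ToTerminated() is a hypothetical helper name, not part of any Windows API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for UNICODE_STRING (assumption: same field meanings as the real one).
struct UnicodeStringLike {
    unsigned short Length;         // bytes of text in Buffer, terminator not counted
    unsigned short MaximumLength;  // bytes available in Buffer
    char16_t      *Buffer;         // may or may not be null terminated
};

// Hypothetical helper: copy the counted text into a new, terminated buffer.
std::vector<char16_t> ToTerminated(const UnicodeStringLike &Src) {
    const std::size_t nChr = Src.Length / sizeof(char16_t);  // bytes -> characters
    std::vector<char16_t> Out(nChr + 1, u'\0');               // +1 for the terminator
    if (nChr != 0) {
        assert(Src.Buffer != nullptr);
        std::memcpy(Out.data(), Src.Buffer, nChr * sizeof(char16_t));
    }
    return Out;  // Out.data() can now be handed to C-string functions
}
```

Note the division by the character size: forgetting that Length is a byte count is the usual way to read twice as far as intended.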

Whoever defined UNICODE_STRING saved two bytes while wasting lifetimes in man-hours. Everything would be compatible if only every UNICODE_STRING were required to include a terminating null character.

After being bitten by this too many times, I decided to take a day and write a wrapper class to manage UNICODE_STRING so that Buffer[] would always be a well-terminated C-string.

NtString interface:

    class NtString {
    public:
        NtString(void);
        NtString(const char *pSrc);
        NtString(const WCHAR *pSrc);
        NtString(const ANSI_STRING &Src);
        NtString(const UNICODE_STRING &Src);
        NtString(bool IsArgList, const char *Fmt, ...);
        NtString(bool IsArgList, const WCHAR *Fmt, ...);
        ~NtString(void);

        NTSTATUS Set(const WCHAR *pSrc);
        NTSTATUS Set(const char *pSrc);
        void Truncate(UINT nChr);
        const WCHAR *GetText(UINT nChr=0);
        UNICODE_STRING &GetUnicode(void); //SHOULD BE CONST, but it causes too much trouble.

        const WCHAR *Print(const char *Fmt, ...);
        const WCHAR *Print(const WCHAR *Fmt, ...);
        const WCHAR *PrintV(const char *Fmt, va_list ArgList);
        const WCHAR *PrintV(const WCHAR *Fmt, va_list ArgList);

    private:
        class ntString *pObj;
        BYTE Obj[64];
    };

A couple of notes:

  • An NtString can be constructed from either ANSI or Unicode text, but it is always stored internally as Unicode.
  • The arguments to Print() are always assumed to be Unicode, even if the format text is ANSI.
  • The Buffer for an empty NtString (one that has never had text appended) points to a static, well-terminated, 0-length C-string.
  • The contents of the UNICODE_STRING returned by GetUnicode() should never be modified. GetUnicode() should really return a const reference, but that would inevitably force the user to cast away the const in order to use it.
  • Storage for the text is allocated using MemAlloc().
  • The Print() functions use my own formatter, which has some subtle differences from the RtlString... functions.
  • NtString borrows code from IString in the user-mode Support library, so it behaves similarly.
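
To make a few of these notes concrete, here is a minimal user-mode sketch of a wrapper that keeps its buffer terminated. It is not the NtString implementation: TerminatedString and UnicodeStringLike are hypothetical names, std::vector stands in for MemAlloc(), char16_t stands in for WCHAR, and the narrow-to-wide conversion assumes plain ASCII input.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for UNICODE_STRING (assumption: same field meanings as the real one).
struct UnicodeStringLike {
    unsigned short Length;         // bytes of text, terminator not counted
    unsigned short MaximumLength;  // bytes available, terminator included here
    char16_t      *Buffer;
};

// Hypothetical wrapper: Buffer always points at a terminated string.
class TerminatedString {
public:
    TerminatedString() { Sync(); }
    explicit TerminatedString(const char *pSrc) { Set(pSrc); }
    TerminatedString(const TerminatedString &) = delete;             // Us.Buffer points
    TerminatedString &operator=(const TerminatedString &) = delete;  // into Text

    // Widen narrow text one character at a time and store it as 16-bit text.
    void Set(const char *pSrc) {
        const std::size_t MaxChr = 0xFFFF / sizeof(char16_t) - 1;  // byte counts must fit 16 bits
        Text.clear();
        for (; pSrc != nullptr && *pSrc != '\0' && Text.size() < MaxChr; ++pSrc)
            Text.push_back(static_cast<char16_t>(static_cast<unsigned char>(*pSrc)));
        Text.push_back(u'\0');  // the terminator is always stored
        Sync();
    }

    const UnicodeStringLike &GetUnicode() const { return Us; }
    const char16_t *GetText() const { return Us.Buffer; }

private:
    // Shared zero-length string used until text is first assigned.
    static char16_t *EmptyBuffer() {
        static char16_t Empty[1] = {u'\0'};
        return Empty;
    }

    // Re-derive the counted-string view from the terminated storage.
    void Sync() {
        if (Text.empty()) {
            Us.Buffer = EmptyBuffer();
            Us.Length = 0;
            Us.MaximumLength = sizeof(char16_t);
        } else {
            Us.Buffer = Text.data();
            Us.Length = static_cast<unsigned short>((Text.size() - 1) * sizeof(char16_t));
            Us.MaximumLength = static_cast<unsigned short>(Text.size() * sizeof(char16_t));
        }
    }

    std::vector<char16_t> Text;
    UnicodeStringLike Us{};
};
```

In this sketch Length still excludes the terminator, so the structure looks normal to any API that takes a counted string, while GetText() is always safe to pass to C-string functions.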


Formatter differences:

  • The decimal formats can be prefixed with ',' (comma) to decorate the number with commas, e.g. "1,234,567".
  • The 'p' format will print only the lower 32 bits of a 64-bit pointer unless prefixed with 'h'.
  • No floating-point formats in the kernel version (yet).
  • The 'S' format will print ANSI text as Unicode.
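
For illustration, the comma decoration and the 32-bit pointer truncation could look something like the sketch below. This is not the formatter's actual code: GroupDecimal() and PointerText() are hypothetical helper names, and the zero-padded uppercase hex layout is an assumption, since the exact output of the 'p' format is not described here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: decimal text with a comma between each group of three digits.
std::string GroupDecimal(long long Value) {
    // Work on the magnitude so the most negative value does not overflow.
    const unsigned long long Mag = Value < 0
        ? 0ULL - static_cast<unsigned long long>(Value)
        : static_cast<unsigned long long>(Value);
    const std::string Digits = std::to_string(Mag);
    std::string Out;
    for (std::size_t i = 0; i < Digits.size(); ++i) {
        // Insert a comma when the count of remaining digits is a multiple of three.
        if (i != 0 && (Digits.size() - i) % 3 == 0) Out += ',';
        Out += Digits[i];
    }
    return Value < 0 ? "-" + Out : Out;
}

// Hypothetical helper: hex text for a pointer-sized value.
// Only the low 32 bits are printed unless Full is set (the role 'h' plays above).
std::string PointerText(std::uint64_t Value, bool Full) {
    char Buf[17];
    if (Full)
        std::snprintf(Buf, sizeof(Buf), "%016llX", static_cast<unsigned long long>(Value));
    else
        std::snprintf(Buf, sizeof(Buf), "%08X", static_cast<unsigned>(Value & 0xFFFFFFFFu));
    return Buf;
}
```

Printing only the low half of a pointer keeps log lines short, at the cost of hiding the upper bits, which is presumably why the full width is opt-in.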
