Tuesday, October 1, 2013

VBA: Unicode Strings and the Windows API

Microsoft Access In this post, we talk about Unicode, about Strings, about how they sometimes become unexpectedly mangled when you pass them around the system and what we can do about it.
We’ll see some Chinese ideograms, peek into memory and display some nonsense on screen in the process.

VBA likes Unicode

Internally, VBA handles strings in Unicode encoded as UTF-16. This means that each ‘character’ of the string uses at least a 16 bit WORD (2 bytes) of memory (not entirely true, but let’s keep it simple).

Below is the proof; it shows the memory content of a VBA string variable that contains the following sentence:

    Chinese or the Sinitic language(s)
    (汉语/漢語, pinyin: Hànyǔ; 中文, pinyin: Zhōngwén)  
    is the language spoken by the Han Chinese

If you see squares in place of the Chinese ideograms, then you are probably using an old version of Internet Explorer and do not have the Asian fonts installed: simply use the latest versions of Firefox or Chrome if you want to display them correctly (although you can still go through this post without seeing the exact glyphs).

In memory, the layout of this VBA string is:

Memory map of the VBA String

Each character of our string occupies 2 bytes: our first letter C whose hexadecimal code 0043 is located at address 0EED7974, then, 2 bytes later at address 0EED7976, we have our second character, h whose code is 0068 and so on.
Note that Windows is a Little Endian architecture, so the lowest significant byte 68 appears first in memory, followed by the most significant byte 00 at the next address.

On the right side, the map displays strange characters in place of the Chinese ones because the memory map interprets each byte as a simple ASCII character, but you can see for instance that the ideogram is located at address 0EED79EA and that its code is 4E2D .

The character codes we see here are simply the ones that would be returned by the AscW() function in VBA (more or less).

VBA doesn’t always like Unicode

The thing though, is that VBA considers the outside world to be ANSI, where characters take (generally) 1 byte and strings must be interpreted according to a Code Page that translate the character code into a different visual representation depending on the encoding.

This means that if you are Greek, your system’s code page will be 737 and you will see Ω (the capital letter Omega) if a string contains the hexadecimal code 97 whereas on a system set for western languages, including English, you will see ù: different representations of the exact same character code.

This nightmare of ANSI encoding made it very hard to pass around strings if you did not know which code page was associated with them. Even worse, this makes it very hard to display strings in multiple languages within the same application.

The Office VBE IDE editor is one such old beast: it only support the input and display of characters in your current system’s Code Page and it will simply display garbage or ? placeholders for any characters it can’t understand.

Fortunately, we’re not really living in an ANSI world any more. Since the days of Windows NT and Windows XP SP2, we’ve been able to use Unicode where each possible character, glyph, symbol has its own code point (things can be quite complicated in Unicode world as well, but that’s the topic for another post).

Unfortunately, VBA still inherits some old habits that just refuse to die: there are times when it will convert your nice Unicode strings to ANSI without telling you.

To make the situation worse, most of the VBA code on the Internet was written by English speakers at a time when Unicode was just being implemented (around 2000).
The result is that even Microsoft’s own examples and Knowledge Base articles still get things wrong and copying these examples blindly will probably make you, your customers and users very unhappy at some point.

To understand this unwanted legacy, and the solutions to avoid these problems, we’ll go through a concrete example.

Message in a box

Let’s study a simple case: display our previous message using the MessageBox Windows API.
The result we want to achieve is this:

Text containing Chinese properly displayed in a message box

Note that in order to display Chinese ideograms, your system must have the proper fonts installed. If not, you’ll end up with little boxes where the glyphs should be displayed.

What we are talking about here is not limited to Asian languages though: it concerns all languages, including English if you ever include any symbols that’s outside of extended ASCII, like the Euro €, mathematical symbols, words in other alphabets like Greek, phonetics, accents, even Emoji (emoticon) symbols.

First attempt: MessageBoxA

OK, let’s go through the motions and try to use the MessageBox API.

Let’s first make a proper declaration that will work with all versions of Office:

    #If VBA7 Then
        Public Declare PtrSafe Function MessageBoxA Lib "user32" _
                                        (ByVal hwnd As LongPtr, _
                                         ByVal lpText As String, _
                                         ByVal lpCaption As String, _
                                         ByVal wType As Long) As Long
    #Else
        Public Declare Function MessageBoxA Lib "user32" _
                                (ByVal hwnd As Long, _
                                 ByVal lpText As String, _
                                 ByVal lpCaption As String, _
                                 ByVal wType As Long) As Long
    #End If

Now, let’s try to display the message box with our text.

    Dim s As String
    s = GetUnicodeString()
    MessageBoxA 0, s, "test", 0

The GetUnicodeString() function simply returns a string containing the text we want to display. It uses a function UnEscStr() from a previous article to convert escaped Unicode sequences into actual characters.

What we obtain is not what we expected:

MessageBoxA does not display the Chinese characters in our string

All Chinese characters have been replaced by ? placeholders.

Ok, so, if we look again at the signature of the function, we’re calling MessageBoxA. What does this A tacked at the end means?
The Windows API has generally 2 version of functions that handle strings: the old ANSI version, ending with a A, and the Unicode version, ending with a W, for Wide-Characters.

Our first tip is: whenever you see a WinAPI declaration for a function ending in A, you should consider the W version instead.
Avoid A WinAPI functions like the plague!

Get it right: MessageBoxW

So, now we know that we must use MessageBoxW for Unicode, the API documentation says so as well, so we must be on the right track.

Let’s declare the right API function then:

    #If VBA7 Then
        Public Declare PtrSafe Function MessageBoxW Lib "user32" _
                                        (ByVal hwnd As LongPtr, _
                                         ByVal lpText As String, _
                                         ByVal lpCaption As String, _
                                         ByVal wType As Long) As Long
    #Else
        Public Declare Function MessageBoxW Lib "user32" _
                        (ByVal hwnd As Long, _
                                     ByVal lpText As String, _
                                     ByVal lpCaption As String, _
                                     ByVal wType As Long) As Long
    #End If

The only change is the W in the function name and we’ll call the Unicode version. Let’s try to open that box again:

    Dim s As String
    s = GetUnicodeString()
    MessageBoxW 0, s, "test", 0

And this displays:

Oh dear.
Looks like Chinese, but it certainly isn’t what we expected. Here we’ve just got Mojibake, garbage due to incorrect processing of Unicode.

What could be wrong? We did pass a proper Unicode string, so what happened?

Well, VBA happened: whenever you pass a String as a parameter in a Declare statement, VBA will convert it to ANSI automatically.

Let me repeat this: when you see a Declare statement that uses As String parameters, VBA will try to convert the string to ANSI and will likely irremediably damage the content of your string.

I’ll show you what happens by going back to our memory map. This is the content of memory before we call MessageBoxW:

String in memory with Unicode characters intact

This is the memory of the same string after the call to MessageBoxW:

String in memory with Unicode characters converted to placeholders

The string has been converted to ANSI, then converted back to Unicode to fit into 2-byte per character but in the process all high Unicode code points have been replaced by ? (whose code is 003F).

Note that the memory location of the string before and after the call are different. This is normal since the string was converted, so another one was allocated (sometimes they end-up occupying the same memory though it’s probably an optimisation of the interpreter).

In short:

    Dim s As String
    s = ChrW(&H4E2D)   
    ' s now contains the 中 character
    MessageBoxW 0, s, "test", 0
    ' s now contains the ? character instead

So our string was actually returned modified to us; we can’t even rely on VBA to keep our original string intact…

Get it right, use pointers!

So, it’s clear now, whenever VBA sees that you are passing a String in a Declare statement, it will silently convert that String to ANSI and mangle any characters that doesn’t work in the current Code Page set for the system.

To avoid that unwanted conversion, instead of declaring the parameters as String, we will declare pointers instead so that our final declaration will be:

    #If VBA7 Then
        Public Declare PtrSafe Function MessageBoxU Lib "user32" Alias "MessageBoxW" _
                                        (ByVal hwnd As LongPtr, _
                                         ByVal lpText As LongPtr, _
                                         ByVal lpCaption As LongPtr, _
                                         ByVal wType As Long) As Long
    #Else
        Public Declare Function MessageBoxU Lib "user32" Alias "MessageBoxW" _
                                (ByVal hwnd As Long, _
                                 ByVal lpText As Long, _
                                 ByVal lpCaption As Long, _
                                 ByVal wType As Long) As Long
    #End If

What we did here is simply define MessageBoxU as an alias for MessageBoxW which we know is the right API function.
Then we replaced the declaration for ByVal lpText As String and ByVal lpCaption As String, by pointers values: ByVal lpText As LongPtr and ByVal lpCaption As LongPtr.

From the point of view of the MessageBoxW function, in both instances it will receive a pointer, except that instead of a String, itwill receive the value of a LongPtr, which is functionally identical.

Now, when using our new MessageBoxU function, we also need to pass the value for the pointers to the strings explicitly:

    Dim s As String
    s = GetUnicodeString()
    MessageBoxU 0, StrPtr(s), StrPtr("test"), 0

The StrPtr() function has been part of VBA for a while now but it is still not documented. It simply return the memory address of the first character of a string as a LongPtr (or Long versions of Office older than 2010).

There are other functions to get the address of a variable: VarPtr() returns the memory address of the given variable while ObjPtr() returns the memory address of an object instance. We’ll have to talk more about those in another post.

So, now, the result:

Text containing Chinese properly displayed in a message box

Hurray! We did it! No unexpected conversion!
If you looked at the memory map, there would be nothing really interesting to see: the string would have stayed the same throughout the call and would be untouched: no conversion, no copy.

Not only did we manage to make call to API functions work, they are also faster because VBA doesn’t have the overhead of these silly Unicode-ANSI-Unicode conversions every time we use the API function.

What about the built-in VBA MsgBox?

VBA has had its own MsgBox function for a while, so why not use it?

    Dim s As String
    s = GetUnicodeString()
    MsgBox s, , "test"

Well, that’s why you should not:

MessageBoxA does not display the Chinese characters in our string

Unfortunately, the VBA function MsgBox this is just a wrapper for the MessageBoxA API call, so no help there.

Parting words

This post has not been about Message boxes, although you will probably find my Enhanced Message Box replacement useful if that is what you are looking for.

The point of all this is simply to remember the following when involving Strings in Win API calls:

  1. Do not use the A version of API calls, always use the W version instead.

  2. This is also relevant for some of the built-in VBA functions: use ChrW$() over Chr() and AscW() over Asc(), both W version handle Unicode characters and are about 30% faster in my tests (see KB145745 for more information).

  3. Check all code involving Strings with some real-world examples containing Unicode characters beyond the usual Extended ASCII set.

  4. VBA will mess-up and convert to ANSI all strings parameters in a Declare statement.

  5. To avoid this mangling, use pointers instead of Strings in your declarations:
    never declare As String in an API Declare statement. Always use ByVal As LongPtr instead and then use StrPtr() in your calls to pass the pointer to the actual string data.

  6. Similarly, when passing a User-Defined-Type structure that contains Strings to a Declare statement parameter, declare the parameter as ByVal As LongPtr and pass the UDT pointer with VarPtr().

  7. Remember that the VBE IDE cannot process and display Unicode, so don’t expect your debug.print to display anything other than ? for high code point characters.

6 comments   |   Filed under :  MSAccess,Programming,VBA

Thursday, September 12, 2013

Access: using .Net strings in VBA for fun and profit

Microsoft AccessWell, maybe the profit part is a bit of an exaggeration, but it will be fun, I promise, unless you’re not a geek in which case I really wonder why you are reading this blog.

I’m not going to shock you if I say that strings manipulations and formatting are really ugly in VBA.
Who has time to read and debug stuff like this?

    str = "Message to """ & recipientName & """" & vbCrLf
    str = str & "Line " & lineNum & " of " & lineCount & ", " & vbCrLf & _
          "Love you " & recipientName & "... " & ChrW$(10083)
    Debug.? str

Wouldn’t it be better just ot be able to write this instead ?

    str "Message to ""{0}""\n"
    str "Line {1} of {2},\n"
    str "Love you {0}... \u2763"
    Debug.? str.ToString(recipientName, lineNum, lineCount)

or, something more compact like:

    Debug.? sFormat("Message to ""{0}""\nLine {1} of {2},\nLove you {0}... \u2763", _
                    recipientName, lineNum, lineCount)

How about some simple date and number formatting?

    Debug.? sFormat("You earned ${1:C2} at {0:t} on {0:d}", now(), 2588.687)

To print this:

    You earned $2,588.69 at 17:40 on 10-09-2013

Or some Unicode sequences?

  Debug.? sFormat("Here be dragons \U0001F409 and Euro signs \u20ac")

Well, the dragon will probably not be visible, even in your browser it’s probably not well supported by your installed fonts, but at least you should see the Euro symbol.

.Net strings

It’s no secret that strings in .Net have a lot more capabilities when it comes to building and formatting.
You can do everything in VBA as well, but things tend to be more difficult to read and compose. In .Net, strings can be formatted in lots of different ways, including creating columns of fixed-length text, padding, aligning text to the left or the right, specifying precision for numbers, formatting dates, etc.

Strings can also contain special character literals to insert newlines or tabs for instance and special Unicode escape sequences meant to represent characters that would otherwise be impossible to include in a string by hand.

Character literals and Unicode escape sequences

Before we go deeper into how things works, let’s go through some of the helper functions you will find in the attached code.

The StringUtils.bas module contains the following function that takes a string that may contain any .Net special character literals and Unicode escape sequences and returns a string that will contain the un-escaped version:

    UnEscStr(str As Variant) As String

For instance, to include new lines in our strings, just include the \n literal. Adding a tab is simply a matter of using \t.
Note that if you want a backslash \ in your string, you will have to escape it to \\ otherwise the function will interpret the following characters in a way you may not expect.

You can also encode any Unicode character using its Unicode code-point expressed in hexadecimal (for instance, the code-point U+0041 represents the character A).

VBA uses UTF-16 for its string encoding, meaning that each character in a string is encoded using 16 bits and should therefore be capable of containing any possible glyph in the world.

However, to include Unicode values in strings is difficult in VBA:

  • You can’t just type the character in the string: the IDE is a remnant of the past and uses Extended ASCII only. Anything that cannot be displayed in the IDE will come out as ?.

  • You can use ChrW$() to include Unicode in your strings, but it is quite verbose and ends-up an ugly mess as you have to cut your string in bits to insert Unicode characters:

    message = "Chinese or the Sinitic language(s) (" & ChrW$(&H6C49) & _
              ChrW$(&H8BED) & "/" & ChrW$(&H6F22) & ChrW$(&H8A9E) & _
              ", pinyin: Hàny" & ChrW$(&H1D4) & "; " & ChrW$(&H4E2D) & _
              ChrW$(&H6587) & ", pinyin: Zh" & ChrW$(&H14D) & _
              "ngwén) is the language spoken by the Han Chinese"
    

    As a more elegant solution, you can use Unicode escape sequences directly in the strings and they will be converted through the UnEscStr() function.

To properly include a Unicode code-point in your string, you must follow some rules:

  • If the code-point is between U+0 and U+FFFF, use the \uXXXX sequence (lower case u followed by 4 hex digits).
    For instance to encode the € sign whose code-point is U+20AC, just include \u20ac in your string.

  • If the code-point is above 0+FFFF, you must use the \UXXXXXXXX (upper case U followed by 8 hex digits).
    For instance, to encode the dragon symbol whose code-point is U+1F409, you will have to use \U0001F409 in your string.

  • There is a third way to include a code-point sequence by using the \x notation.
    This is similar to \u but you do not have to use exactly 4 hex digits, you can use 1, 2, 3 or 4.
    For instance, instead of encoding the letter A as \u0041, you can simply use \x41.
    However, since this notation accepts a variable number of digits, you have to be careful: the string \x27Effect\x27 will not print 'Effect', because E and f are considered hex digits, you will end up with something that will look like )fect' instead.
    Better use the \u encoding, it’s what you’ll need in 99% of the cases anyway.

As an example, we can rewrite the code above a bit more cleanly:

    message = UnEscStr("Chinese or the Sinitic language(s) " & _
                       "(\u6C49\u8BED/\u6F22\u8A9E, pinyin: Hàny\u01D4; " & _
                       "\u4E2D\u6587, pinyin: Zh\u014Dngwén) " & _
                       "is the language spoken by the Han Chinese")

Would show this when displayed in the enhanced dialogbox:

Displaying Unicode Chars

String formatting

The StringUtils.bas module also contains an implementation of the .Net String.Format() method:

    sFormat(frmt As String, ParamArray args() As Variant) As String

In .Net, string formatting uses {x} place-holders inside the string itself. The place-holder {0} refers to the value of the first parameter, {1} to the value of the second, and so on.
Place-holders can also be used to define how the value should be formatted:

    Debug.? sFormat("You earned ${1:C2} at {0:t} on {0:d}", now(), 2588.687)

Since the sFormat() function internally uses UnEscStr(), you can of course include character literals and Unicode escape sequences as well.

    Debug.? sFormat("You earned:\n\t\u20ac{1:N}\nat:\n\t{0:t}\non:\n\t{0:d}", now(), 2588.687)

Will display:

    You earned:
        €2,588.69
    at:
        17:44
    on:
        10-09-2013

You will find examples on the String.Format help page and on all the associated pages that give formatting information for particular data:

The StringBuilder class

The other attached file StringBuilder.cls contains a class that loosely corresponds to the .Net System.Text.StringBuilder class.

That helper class is used to compose strings efficiently. It understands string formats and will also un-escape character literals and Unicode escape sequences.

The class is pretty straightforward to use:

    Dim s As StringBuilder
    Set s = New StringBuilder
    s "SELECT Reference "
    s "FROM StockEvent "
    s "WHERE  ((EventType = {0}) "
    s "        AND (POID = {1}) "
    s "        AND (AvailableQty > {2}));"
    resultString = s.ToString(EventNum, ID, 33.55)

In order to make the class less verbose, the Append() method of the StringBuilder is set as the default method for the class. So the code above is actually exactly the same as:

    Dim s As StringBuilder
    Set s = New StringBuilder
    s.Append "SELECT Reference "
    s.Append "FROM StockEvent "
    s.Append "WHERE  ((EventType = {0}) "
    s.Append "        AND (POID = {1}) "
    s.Append "        AND (AvailableQty > {2}));"
    resultString = s.ToString(EventNum, ID, 33.55)

Having the most common method as a default makes things a bit cleaner, I think.

The thing to remember, is that in our example, s is an instance of the class, it is not a string, it’s an object, so once you have completed your string composition, you must call ToString() to get the actual VBA string back.

The class is fairly flexible, allowing you to insert values in each formatted string or, as shown above, by adding the parameters in the call to ToString():

    s "SELECT Reference "
    s "FROM StockEvent "
    s "WHERE  ((EventType = {0}) ", EventNum
    s "        AND (POID = {0}) ", ID
    s "        AND (AvailableQty > {0}));", 33.55
    resultString = s.ToString()

You can also append numbers, dates, boolean directly to the string as you compose it:

    s "The date : "
    s Now()
    s "I hope it's not false: "
    s True
    s "I owe a lot of money: "
    s 12.55

When you want to display some text that is going to be printed or inserted in a file or textbox, you can use AppendLine() instead of Append()

    Public Sub ShowHelp()
        Dim s As New StringBuilder
        s.AppendLine "Super Secret Program - v{0}", progVersion
        s.AppendLine "(c) my secret self"
        s.AppendLine "Usage:"
        s.AppendLine " supsec [options] filename"
        Debug.? s.ToString()
    End Sub

Alternatively, you can set the special property AlwaysAppendCrLf to True, in which case, a newline will be added every time you call Append() :

    Public Sub ShowHelp()
        Dim s As New StringBuilder
        s.AlwaysAppendCrLf = True
        s "Super Secret Program - v{0}", progVersion
        s "(c) my secret self"
        s "Usage:"
        s" supsec [options] filename"
        Debug.? s.ToString()
    End Sub

You can also insert a string or numbers, etc., at any place within the string being built:

    s "Line1Line3"
    s.Insert 5, "\nLine{0}\n", 2

Or replace any substring by another one:

    s "Line1\nRow2\nLine3\nRow4"
    s.Replace "Row", "Line"

Or simply remove some characters from the string:

    s "Line1\nRow2\nLine3"
    s.Remove 5, 5

Instead of getting the whole string, you can get an portion of it:

    s.ToSubstring(4, 10) 

Or just return any character at any position in the current string:

    s "Line1\nRow2\nLine3"
    Debug.? s.Chars(6)  ' Will return R
    Debug.? s.Codes(6)  ' Will return 82, which is the ASCII/Unicode for R

If you want to re-use the StringBuilder object, you can simply clear it to remove all strings already added:

    s.Clear

A few more details:

  • All operations on the StringBuilder object return the object instance itself.
    This allows for chaining of operations like this:
        str = s("Line1\n")("Line3\n").Insert(5, "Line2\n").Replace("Line", "Row").ToString()
  • You can find out the current length of the string with the Length property.

  • Be careful that Replace() is case sensitive since all .Net operations are case sensitive.
    So Replace("line", "Row") is not the same as Replace("Line", "Row").

Using .Net strings in VBA

Using .Net from VBA can be a challenge depending on what part of .Net you want to use and how you want to use it from VBA.

The .Net String object itself is not accessible from VBA. You can’t create an instance of it directly from VBA and when you use an existing instance of a .Net string, it gets converted to a VBA string immediately.

However, there is a very useful .Net class that is already registered on you machine: System.Text.StringBuilder.

This .Net class is the workhorse behind the implementation of the StringBuilder class and the sFormat function in VBA.
Thanks to this easily instantiable object, we can use some of the power of .Net in our VBA string at no cost.

Some performance information

Of course, doing all that processing has a cost, although in the vast majority of cases, it should not be noticeable.

  • I timed the use of UnEscStr above in the string that contains Chinese Unicode characters.
    The call takes 0.010ms on my machine (a 4 year old i7)

  • A call to sFormat similar to the example given in this article took about 0.10ms.

This timing depends a lot of course on the complexity of the string, the type of formatting, the amount of literal characters an Unicode sequences to convert, etc.

So if you are formating 10,000s of strings, or un-escaping 100,000s or strings the processing time will probably add up to be noticeable but in most use cases, even if you have a few 1,000s of strings, processing will not impact performance in any noticeable way.

Code Updates

v1.0: 12SEP2013
Original release.

Code download

The 2 files contained in the archive below are all you need. Actually, if you are only interested in the sFormat() or UnEscStr() function you only need to import the StringUtils.bas module.

However, if you want to use the StringBuilder class, you must import both files.

To import the files, just open your Access application, open the IDE then, from the menu, File > Import File, and select the StringUtils.bas, then repeat for StringBuilder.cls.
That’s all there is to it.

Download Download the StringUtils.v1.0.zip (7KB), version 1.0 – 12SEP2013 containing the code files.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
Free for re-use in any application or tutorial providing clear credit is made about the origin of the code and a link to this site is prominently displayed where end-users can easily access it.

1 comment   |   Filed under :  Database,MSAccess,Programming

Sunday, July 7, 2013

VBA for 32 and 64 bit systems

Microsoft Access I’ve put off writing about x64 MS Office because I was a bit lazy and hadn’t build the necessary environment to test it properly.

In my day-to-day job -as far as Access is concerned-, I have so far targeted the 32 bit (x86) version of Office 2010 but the 64 bit version is bound to become more popular with time and some people already have to use it, even if sometimes their reasons are misguided.

When should I care?

If your code has any chance of being run on Office 2010 and later, and you want to avoid issues when one of your users or customers will have Office x64 (or just one of its components) installed, then you should make sure to use your code uses the new VBA7 constructs.

The changes to be performed only affect the parts of your code that use Win API calls or manipulate Windows control handles and memory pointers.

For a small fraction of applications, actually using Access/Excel x64 could make sense as it allows you to deal with 64 bit integers (as opposed to the standard 32 bit long). While this could be a useful feature, code that use these 64 bit integers will not compile on a 32 bit version of Office.

If you use external libraries and controls

If you are using external libraries and ActiveX controls, it’s unlikely that they will only work in Office x64. The reason is that most libraries (especially older ones) are only available in 32 bits.

For these to work under Office x64, they would have to be compiled especially for 64 bit as well.

So, the bottom line is: do not use external libraries and ActiveX components (not even the ones that usually come with Office!) unless you can either:

  • completely control the user environment an ensure it will always run Office x86 (32 bit).
  • you can compile or get 64 bit versions of these libraries and ActiveX components so you can build a version of your application that will work for all your users.

If your code targets Access 2010 and later

There are 2 changes that may affect your code: external library declarations and usage of the new pointer and very large integer types.

Library declarations:

Any declaration for external libraries (such as Win API or code that resides in DLLs), the declaration statement has changed and must use the new PtrSafe keyword to tell the compiler that the declaration should be interpreted in a 64 bit context when relevant. If you omit the keyword, your declarations will always be interpreted as being in a 32 bit context and the results may be random.

For instance, the old declaration for the SetFocus Win API function was:

    Private Declare Function WinAPISetFocus Lib "user32" _
                    Alias "SetFocus" (ByVal hwnd As Long) As Long

Now it is:

    Private Declare PtrSafe Function WinAPISetFocus Lib "user32" _
                    Alias "SetFocus" (ByVal hwnd As LongPtr) As Long

Note how the hwnd parameter is now a LongPtr instead of just a Long.

New types:

You will need to replace declarations that use long with the more appropriate type:

  • LongPtr
    Use the new LongPtr wherever you deal with declarations that relate to Windows control handles or memory locations.

  • LongLong
    Defines a 64 bit integer.
    The reason for LongLong is mainly to access some 64 bit versions of the Windows API, in particular, for math functions.
    This type only exists if you are running under a 64 bit version of Access. On 32 bit systems, it will cause a compile error. To avoid that problem where your code needs to run on both 32/64 bit platforms, you can use the new conditional constant Win64:

        #If WIN64 Then
            Dim largeInt As LongLong  ' Declares a 64 bit integer
        #Else
            Dim largeInt As Long      ' Declares a standard 32 bit integer
        #End If

The compiler will only see and compile one code path and will ignore the other, so doing this does not affect the speed of your application.

If your code must also work in Access 2007 and earlier

Older versions of Access will not be able to compile the new PtrSafe, LongPtr and LongLong keywords introduced by VBA7. If your code must also work in older versions of Access, you must either create 2 versions of your application (not very convenient), or simply use the new conditional constant VBA7 to separate blocks of code that will run on Office 2010 and later from blocks of code that will run on Office 2007 and ealier.

The new VBA7 conditional constant will evaluate to True if it is running under Office 2010 and later.

So now you can create variations of your code that will work everywhere:

    #If VBA7 Then
      Private Declare PtrSafe Function WinAPISetFocus Lib "user32" _
                      Alias "SetFocus" (ByVal hwnd As LongPtr) As Long
    #Else
      Private Declare Function WinAPISetFocus Lib "user32" _
                      Alias "SetFocus" (ByVal hwnd As Long) As Long
    #End If

Adapting old code

When going through old code, you can’t just add PtrSafe to all API declarations and do a search to replace Long into LongPtr. That would be the surest way to get everything wrong.

Instead, you must be very careful about the intent of the code and understand what your variables are manipulating: is that long really just a number or is it mean to be a memory location or a control handle ?

To help you, at least as far as the most common Win API declarations are concerned, Microsoft has made available a text file with hundreds of such declarations compatible for Office x64: Win32API_PtrSafe with 64-bit Support

If you need to use a WinAPI call that isn’t in the list, you will need to have a look at its actual definition and, based on the type of data being passed around, decide if you need to use a simple Long or a LongPtr.

For instance, I had to use the MonitorFromWindow API function. Its C++ definition is as follow:

    HMONITOR MonitorFromWindow(
      _In_  HWND hwnd,
      _In_  DWORD dwFlags
    );

The definition tells me:

  • That the function is taking a Handle (basically a pointer) to a Window and some flags
  • and that the function returns a Handle to the display monitor (basically a pointer).

So the API definition that will work on all versions of Access is:

    #If VBA7 Then
        ' Works in 32/64 bits of Office 2010 and later
        Declare PtrSafe Function MonitorFromWindow Lib "user32" _
                                 (ByVal hwnd As LongPtr, ByVal dwFlags As Long) As LongPtr
    #Else
        ' Works in Office 2007 and earlier
        Declare Function MonitorFromWindow Lib "user32" _
                         (ByVal hwnd As Long, ByVal dwFlags As Long) As Long
    #End If

When using the external function, you also need to be careful in the code and use the VBA7 conditional to separate the parts where you access structure pointers:

    Private Sub Redimension()
        #If VBA7 Then
            Dim hMonitor    As LongPtr
        #Else
            Dim hMonitor    As Long
        #End If
        hMonitor = MonitorFromWindow(Application.hWndAccessApp, &H0&)
        ... do stuff with hMonitor ...
    End Sub

Make sure you save, re-compile and test your project as you go along. Also make frequent backups of your application as a wrong declaration could crash and corrupt it.
It may even be possible to crash the whole machine, although I have not experienced that so far.

Conclusion

Updating old code to make it work properly on all versions of Access is not difficult. It is usually fairly easy, although it can be a bit cumbersome at times.

The trick is to know what your code really needs to target:

  • Office 2010 and later, the present and the future, in which case everything is straightforward and you use the new VBA7 keywords PtrSafe, LongPtr and LongLong where appropriate.
  • All versions of Access, in which case you will have sometimes to separate and maintain 2 versions of your code using the VBA7 conditional.

References

4 comments   |   Filed under :  Database,MSAccess,Programming,VBA

Wednesday, September 7, 2011

Programming podcasts

Podcasts are a great way to get educated and entertained. As a developer we are lucky to have a choice of some fine podcasts from industry leaders. If you commute, jog, travel, you can easily use that time away from the computer to get better informed and reflect about our field.

Continue Reading 1 comment   |   Filed under :  Fun,Musing,Programming

Wednesday, July 13, 2011

Office 2010 SP1 and Access 2010 Runtime SP1 are broken

Microsoft Access (Updated Saturday 10DEC2011.) On 28th of June 2011, Microsoft Office 2010 Service Pack 1 and the Access 2010 Runtime Service Pack 1 were issued.

After upgrading my development machine (Win7 x64) and a few clients (Windows 2008R2 x64) to SP1 (x86), I started to get strange issues:

  • I use .Net libraries from my Access application and suddenly, even when not instantiating any .Net objects, Access would crash, usually on startup, but sometimes when opening the VBE.
    Decompiling and re-compacting the database would be OK, usually once, but the problem would reappear the next time I would restart the application.
  • In the Runtime, I would get strange errors, such as The setting your entered isn’t valid for this property, or Action Failed Error Number: 2950, or Runtime Error 3270 Property not found.
    The strange thing about these errors is that they would occur in places that had not been modified for many releases of our application, parts that have been running without problem until now.
  • Another weird issue was the systematic reset of our custom ribbon to its first tab. this could happen randomly, but most it could also be reproduced by simply opening a report as a tab page (that fill-in the whole MDI window). When closing that form, the first tab of the ribbon would select itself automatically. This wasn’t happening when closing pop-up windows.

After removing the Office and Runtime Service Pack 1, everything went back to normal.

A fix, finally!

A good 6 months after SP1 was released, Microsoft finally issued hotfix for the issue!
So, read about it on the MS Forums and get your KB2596585 hotfix, then decompile your database to clean it out.

12 comments   |   Filed under :  Database,MSAccess,Programming

Sunday, October 3, 2010

Admin: Linux file server performance boost (ext4 version)

LinuxIn the previous article, I showed how to improve the performance of an existing file server by tweaking ext3 and mount settings.
We can also take advantage of the availability of the now stable ext4 file system to further improve our file server performance.

Some distribution, in particular RedHat/CentOS 5, do not allow us to select ext4 as a formatting option during setup of the machine, so you will initially have to use ext3 as file system (on top of LVM preferably for easy extensibility).

A small digression on partitioning

Remember to create separate partitions for your file data: do not mix OS files with data files, they should live on different partitions. In an enterprise environment, a minimal partition configuration for a file server could look like:

Hardware:

  • 2x 160GB HDD for the OS
  • 4x 2TB HDD for the data

The 160GB drives could be used as such:

  • 200MB RAID1 partition over the 2 drives for /boot
  • 2GB RAID1 partition over the 2 drives for swap
  • all remaining space as a RAID1 partition over the 2 drives for /
    Note though that it is generally recommended to create additional partitions to further contain /tmp and /var.

The 2TB drives could be used like this:

  • all space as RAID6 over all drives (gives us 4TB of usable space) for /data
  • alternatively, all space as RAID5 over all drives (gives us 6TB of usable space) The point of using RAID6 is that it gives better redundancy than RAID5, so you can safely add more drives later without increasing the risk of failure of the whole array (which is not true of RAID5).

Moving to ext4

If you are upgrading an existing system, backup first!

Let’s say that your /data partition is an LVM volume under /dev/VolGroup01/LogVol00. First, make sure we have the ext4 tools installed on our machine, then unmount the partition to upgrade:

    # yum -y install e4fsprogs
    # umount /dev/VolGroup01/LogVol00

For a new system, create a large partition on the disk, then format the volume (this will destroy all data on that volume!).

    # mkfs -t ext4 -E stride=32 -m 0 -O extents,uninit_bg,dir_index,filetype,has_journal,sparse_super /dev/VolGroup01/LogVol00
    # tune4fs -o journal_data_writeback /dev/VolGroup01/LogVol00

Note: on a RAID array, use the appropriate -E stride,stripe-width options, for instance, on a RAID5 array using 4 disks and 4k blocks, it could be: -E stride=16,stripe-width=48

For an existing system, upgrading from ext3 to ext4 without damaging existing data is barely more complicated:

    # fsck.ext3 -pf  /dev/VolGroup01/LogVol00
    # tune2fs -O extents,uninit_bg,dir_index,filetype,has_journal,sparse_super /dev/VolGroup01/LogVol00
    # fsck.ext4 -yfD /dev/VolGroup01/LogVol00

We can optionally give our volume a new label to easily reference it later:

    # e4label /dev/VolGroup01/LogVol00 data

Then we need to persist the mount options in /etc/fstab:

    /dev/VolGroup01/LogVol00    /data    ext4    noatime,data=writeback,barrier=0,nobh,errors=remount-ro    0 0

And now we can remount our volume:

    # mount /data

If you upgraded an existing filesystem from etx3, you may want to run the following to ensure the existing files are using extents for file attributes:

    # find /data -xdev -type f -print0 | xargs -0 chattr +e
    # find /data -xdev -type d -print0 | xargs -0 chattr +e

Important notes

The mounting options we use are somewhat a bit risky if your system is not adequately protected by a UPS.
If your system crashes due to a power failure, you are more likely to lose data using these options than using the safer defaults.
At any rate, you must have a proper backup strategy in place to safeguard data, regardless of what could damage them (hardware failure or user error).

  • The barrier=0 option disables Write barriers that enforce proper on-disk ordering of journal commits.
  • The data=writeback and nobh go hand in hand and allow the system to write data even after it has been committed to the journal.
  • The noatime ensures that the access time is not updated when we’re reading data as this is a big performance killer (this one is safe to use in any case).

References

1 comment   |   Filed under :  Linux,sysadmin

Sunday, July 11, 2010

Admin: Linux file server performance boost (ext3 version)

Linux Using a Linux for an office file server is a no-brainer: it’s cheap, you don’t have to worry about unmanageable license costs and it just works.

Default settings of most Linux distributions are however not optimal: they are meant to be as standard compliant and as general as possible so that everything works well enough regardless of what you do.

For a file server hosting large numbers of files, these default settings can become a liability: everything slows down as the number of files creeps up and it makes your once-snappy fileserver as fas as a sleepy sloth.

There are a few things that we can do to ensure we get the most of our server.

Checking our configuration

First, a couple of commands that will help us investigate the current state of our configuration.

  • df will give us a quick overview of the filesystem:

    df -T
    Filesystem    Type   1K-blocks      Used Available Use% Mounted on
    /dev/md2      ext3    19840804   4616780  14199888  25% /
    tmpfs        tmpfs      257580         0    257580   0% /dev/shm
    /dev/md0      ext3      194366     17718    166613  10% /boot
    /dev/md4      ext3     9920532   5409936   3998532  58% /var
    /dev/md3      ext3      194366      7514    176817   5% /tmp
    /dev/md5      ext3    46980272  31061676  13493592  70% /data
    
  • tune2fs will help us configure the options for each ext3 partition. If we want to check what is the current configuration of a given partition, says we want to know the current options for our /data mount:

    # tune2fs -l /dev/md5
    

    If I was using LVM as a Volume manager, I would type something like:

    # tune2fs -l /dev/VolGroup00/LogVol02
    

    This would give lots of information about the partition:

    tune2fs 1.40.2 (12-Jul-2007)
    Filesystem volume name:   <none>
    Last mounted on:          <not available>
    Filesystem UUID:          d6850da8-af6f-4c76-98a5-caac2e10ba30
    Filesystem magic number:  0xEF53
    Filesystem revision #:    1 (dynamic)
    Filesystem features:      has_journal resize_inode dir_index filetype 
                              needs_recovery sparse_super large_file
    Filesystem flags:         signed directory hash
    Default mount options:    user_xattr acl
    Filesystem state:         clean
    Errors behavior:          Continue
    ....
    

    The interesting options are listed under Filesystem features and Default mount options. For instance, here we know that the partition is using a journal and that it is using the dir_index capability, already a performance booster.

  • cat /proc/mounts is useful to know the mounting options for our filesystem (just listed some interesting ones here):

    rootfs / rootfs rw 0 0
    /dev/root / ext3 rw,data=ordered 0 0
    /dev/md0 /boot ext3 rw,data=ordered 0 0
    /dev/md4 /var ext3 rw,data=ordered 0 0
    /dev/md3 /tmp ext3 rw,data=ordered 0 0
    /dev/md5 /data ext3 rw,data=ordered 0 0
    none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
    /dev/md4 /var/named/chroot/var/run/dbus ext3 rw,data=ordered 0 0
    

    The data=ordered mount parameter tells us of the journaling configuration for the partition.

Journaling

So what is journaling?
It’s one of the great improvements of ext3: a journal is a special log on the disk that keeps track of changes about to be made. It ensures that, in case of failure, the filesystem can quickly recover without loss of information.

There are 3 settings for the journalling feature:

  • data=journal the most secure but also slowest option since all data and metadata is written to disk: the whole operation needs to be completed before any other operation can be completed. It’s sort of going to the bank for a deposit, filling the paperwork and making sure the teller puts the money in the vault before you leave.
  • data=ordered is usually the default compromise: you fill-in the paperwork and remind the teller to put the money in the vault asap.
  • data=writeback is the fastest but you can’t be absolutely sure that things will be done in time to prevent any loss if a problem occurs soon after you’ve asked for the data to be written.

In normal circumstances all 3 end-up the same way: data is eventually written to disk and everything is fine.
Now if there is a crash just as the data was written only option journal would guarantee that everything is safe. Option ordered is fairly safe too because the money should be in the vault soon after you left; most systems use this option by default.

If you want to boost your performance and use writeback you should make sure that:

  • you have a good power-supply backup to minimise the risk of power failure
  • you have a good data backup strategy
  • you’re ok with the risk of losing the data that was written right before the crash.

To change the journaling option you simply use tune2fs with the appropriate option:

    # tune2fs -o journal_data_writeback /dev/md5

Mount options

Now that we’ve changed the available options for our partition, we need to tell the system to use them.
Edit /etc/fstab and add data=writeback to the option columns:

    /dev/md5     /data    ext3    defaults,data=writeback   1 2

Next time our partition is mounted, it will use the new option. For that we can either reboot or remount the partition:

    # mount - o remount /data

noatime option

There is another option that can have a very dramatic effect on performance, probably even more than the journaling options above.

By default, whenever you read a file the kernel will update its last access time, meaning that we end up with a write operation for every read!
Since this is required for POSIX compliance, almost all Linux distributions leave this setting alone by default.
For a file server, this can have such drastic consequence on performance.

To disable this time-consuming and not useful feature (for a file server), simply add the noatime option to the fstab mount options:

    /dev/md5     /data    ext3    defaults,noatime,data=writeback   1 2

Note that updating access times is sometimes required by some software, such as mail software (such as mutt). If you properly keep your company data in a dedicated partition, you can enable the mount options only for that partition and keep other options for the root filesystem.

dealing with errors in fstab

After doing the above on one of the servers, I realized that I made a typo when editing /etc/fstab.
This resulted in the root filesystem being mounted read-only, making fstab impossible to edit…

To make matters worse, this machine was a few thousand miles away and could not be accessed physically….

Remounting the root filesystem resulted in errors:

    # mount -o remount,rw /
    mount: / not mounted already, or bad option

After much trial and rebooting, this worked (you need to specify all mounting options, to avoid the wrong defaults from being read from etc/mtab`):

    # mount  -o rw,remount,defaults /dev/md2 /

After that, I could edit /etc/fstab and correct the typo…

Conclusions

How much these options will improve performance really depends on how your data is used: the improvements should be perceptible if your directories are filled with large amounts of small files.
Deletion should also be faster.

1 comment   |   Filed under :  Linux,sysadmin

Wednesday, September 9, 2009

Access: checking blank variables

Microsoft Access I often have to test String, Variant or Object variables that have no content and could be considered ‘blank’.

The problem is that testing for “blankness” can mean many different things to different types:

  • For an Object type, the variable can be Nothing.
  • For a String type, the string can have no content at all: "", vbNullString.
  • For a Variant type, the string can have any of the following attributes or values:
    • it can be Missing if the variable is an unused optional parameter,
    • it can be Empty if it was never assigned,
    • it can be Null if, for instance it’s bound to a nullable field or unbound with no value,
    • it can be an empty string "", vbNullString.

When having to check these variables in code, it can be tiresome to have to go through testing some of these possibilities just to find out that your variable does or not not contains something useful, regardless of the type of variable you are using.

To avoid having to do all these tests, make the code a bit more tidy and allow me to move on to more important things, I use this small utility function quite often:

'-----------------------------------------------------------------------------
' True if the argument is Nothing, Null, Empty, Missing or an empty string .
'-----------------------------------------------------------------------------
Public Function IsBlank(arg As Variant) As Boolean
    Select Case VarType(arg)
        Case vbEmpty
            IsBlank = True
        Case vbNull
            IsBlank = True
        Case vbString
            IsBlank = (LenB(arg) = 0)
        Case vbObject
            IsBlank = (arg Is Nothing)
        Case Else
            IsBlank = IsMissing(arg)
    End Select
End Function

So now I don’t have to worry so much about the type of the variable I’m testing when I want to know if it contains useful data:

...
' Here assume that CustomerReference is a control on a form.
' By using IsBlank() we avoid having to test both for Null and empty string.
If IsBlank(CustomerReference) Then
   MsgBox "Customer Reference cannot be left blank."
End If
...

Obviously, IsBlank() doesn’t replace the other tests but I found it to be more straightforward to use in most cases.

6 comments   |   Filed under :  Database,MSAccess,Programming

Wednesday, August 26, 2009

A story about exceptional service

security01.pngRecently I found myself constrained by the puny 200GB of my Mac Book Pro and I bought a 500GB Seagate drive to replace it (a fast 7200 rpm one).
The Macbook Pro has no easy access for the drive so you have to resort to dismantling the case to access it. This put me off replacing the drive because I would probably be voiding the warranty and was running the risk of damaging this expensive piece of equipment.

I’ve been filling the drive with pictures from my recent camera purchase and I couldn’t put it off any longer, so I bought the new drive and went online to find some good tutorial on how to crack open the Macbook Pro case.

After a few searches, I noticed that many people were referring to the iFixit.com website. It was very easy to find the tutorial I was looking for, I didn’t have to register, and each step was made very clear and simple.
It took no time to open the case and replace the drive.
I was very happy with that find.

Now, that’s not the end of the story.

A couple of days before I replaced the drive the left fan of the laptop suddenly became noisy. This would happen a few times a day, at random, and would last 10-20 minutes.
My only solution to get this repaired was to get to the local Apple service shop. Even though I knew exactly which part number was to be replaced, they still wanted me to:

  • go across town to visit them so they could see for themselves what the problem was: annoying because the problem was intermittent so I may have to go for nothing.
  • wait for the part to arrive a few days later.
  • go back to leave the laptop
  • go again to collect the repaired laptop the next day or so. So all in all: about 6h spend travelling back and forth + no laptop for a couple of day + the risk that some indiscreet technician start looking through my personal stuff.

Instead, I went back to the iFixit website:

  • identified my machine
  • found out the list of spare parts available from their store
  • added the fan to my cart
  • paid for it.
  • found a guide that showed how to replace the part.

That took me all of 10 minutes; I placed my order on Thursday and the next Monday I received the part … halfway across the globe!

I also got a survey request from iFixit and left some comments, from which I got back two nice detailed email follow-ups, one from the CEO saying they were implementing my remarks as part of their site improvement efforts.

Well, I thought I would share this story. It’s not that often that you get excited by an online vendor that not only does its job well but goes beyond expectations.

Add comment   |   Filed under :  Business,Hardware,Reviews,sysadmin

Thursday, June 11, 2009

Access: Run-time Error 3155 ODBC insert on a linked table failed

Microsoft Access I have been spending a lot of time trying to find out why some of the code used to insert new records into a linked SQL Server table would systematically fail with an error:

Run-time Error '3155' ODBC--insert on a linked table  failed

It was driving me mad.
I could insert a simple record using SQL Server Management Studio, I could add new records to the table in datasheet mode within Access, but as soon as I tried to insert a record from code, whether using DAO recordset or executing the same SQL INSERT, it would miserably fail.

After a fair bit of investigation and tests, of which you can read the full account on the question I asked on StackOverflow, it turns out that this is a long-standing bug in the ODBC Driver (or Access).

Memo fields in Access are usually translated into nvarchar(MAX) in SQL Server by tools like SSMA.
Unfortunately, when you link tables having these fields using the SQL Server Client driver, these fields get incorrectly interpreted as string, even though they appear ok from the table design view.
It’s only if you try to insert something into the field, either text larger than 255 chars or NULL, that you get the error message.

So, the solution, at least in this case, is to revert to the older SQL Server ODBC driver instead, or use varchar() instead of nvarchar(), but if you’re dealing with Unicode, you have to stick with nvarchar().

References

4 comments   |   Filed under :  Database,MSAccess,Programming,SQL Server

Previous Posts


about

Renaud This is a simple technical weblog where I dump thoughts and experiences from my computer-related world.
It is mostly focused on software development but I also have wider interests and dabble in architecture, business and system administration.
More About me…

My StackOverflow Profile
My (sporadically active) StackOVerflow account

Most Recent Posts

Categories

Links

Feeds