Category Archives: code

ASCII Delimited Text: Why the “Right” Answer Isn’t Very Useful

To overcome some issues with the SQL Server bulk import/export process, I went shopping for a an unquoted, delimited format this morning.  There’s a very old, very simple solution that would have worked fine in my specific circumstance.  Here’s why I’m not going to use it:

The problem I had was that I need to export using the SQL Server bcp utility and re-import using BULK INSERT.  While you can specify the separators with these tools, they don’t handle quoted fields at all, i.e. quoting, if present, doesn’t actually protect the separators should they appear in data.  Per the documentation:

If a terminator character occurs within the data, it is interpreted as a terminator, not as data, and the data after that character is interpreted as belonging to the next field or record. Therefore, choose your terminators carefully to make sure that they never appear in your data.

This was a problem, because my data actually could contain tabs, carriage returns and line feeds, which were the default separators.  It also definitely contained commas.  That puts all the common delimiters out of commission.  “Choose your terminators carefully” is pretty vague, but I figured why not give it a shot?

The approach I almost took was to use the ASCII “record separator” and “unit separator” (ASCII 30 and 31, HEX 0x1E and 0x1F respectively).   Here’s a blog post that nicely sums up the format and why it’s historically and logically valid.

Though it’s not well-documented (somewhat contra-documented), SQL Server supports this.   Even though they suggest that only “printable” characters and a small set of control characters are valid, I had no problem exporting an ASCII-delimited text file using a command like this:

bcp SELECT "x,y FROM t" queryout "c:\filename.asc" -c -r 0x1e -t 0x1f -UTF8 -T -Slocalhost"

I didn’t get as far as trying the BULK INSERT on the other end, though, and here’s why…

Once I had that “ASCII delimited” file, I opened it in NotePad++ to verify that the format was readable and correct.  It was, but the effect wasn’t pretty.  I immediately realized that if I wanted to do anything else with this data–extract a column as text, import into Excel–I was going to have problems.  Excel’s text import wizard, for example, doesn’t support control characters other than tab.  This wasn’t really news to me as I see weird control characters in ASCII all the time working with magnetic stripes, bar codes, RFID, and other encoding and storage mechanisms.  Yes, you can eventually get at the data to a usable form with search and replace, or worst-case regular expressions, but why make it hard to manage if you don’t have to?

In my case, the whitespace control characters in the data improved readability but weren’t functionally required–the data payload itself was code.  Plus, I had comment blocks available as a “quote-like” protection structure.  So, I ended up compromising on replacing the whitespace control characters in a such a way that I can get them back if I want to, or leave them alone for the automated process.  What I ended up doing was this:

bcp SELECT "x,REPLACE(REPLACE(REPLACE(y,CHAR(13),'/*CR*/'),CHAR(10),'/*LF*/'),CHAR(9),'/*TAB*/') FROM t" queryout "c:\filename.tsv" -c -UTF8 -T -Slocalhost"

That produces a “normal” tab-separated file with CRLF as the record separator.  I knew that “x” couldn’t contain those characters, so by replacing them out of “y” I have a file that safely exports and imports while being viewable, editable and importable using normal tools without excessive processing.

I wish we had kept better support for ASCII control characters in our tools as we moved forward with new technologies–it would have been useful to have distinct separators that never appeared in data (until, inevitably, your data payloads themselves contained ASCII-delimited data, at which point you’re back to quoting or replacing or choosing new separators… rinse… repeat).  Of course another solution would have been making the SQL Server utilities just slightly more robust so they could process modern quoted .csv and .tsv formats.  There’s always version.next, right?

Invalid FORMATETC structure (Exception from HRESULT: 0x80040064 (DV_E_FORMATETC)) on Simple Windows Forms Drag-and-Drop Implementation

When creating a Windows Forms application in Visual Studio and implementing drag-and-drop you may encounter this exception during debugging if you drag outside your own application, even though the drag and drop operations complete successfully inside your application.

This may not be a bug, or even an error!  Things to try:

  1. Run the application executable directly from the debug folder, not in Visual Studio debug mode.
  2. Run in VS debug mode with your application directly over a window that can accept its drag content type.  Perform the drag directly from your application to the valid drop zone on the other application, passing over no other applications (including the desktop) or invalid drop zones.  For example, if you are using DataObject.SetText, place your application directly over the text area of a text editor that accepts dragging, e.g. WordPad.

In those two conditions, you probably won’t see the exception thrown and the drag-and-drop will succeed.

I think what’s happening here is that Visual Studio is being a little too aggressive in watching the Windows event messaging system.  Because of the way dragging works, these “first chance” COM exceptions will occur as the dragging mouse passes over targets that cannot accept the content stored in the DataObject.  If they have any drag-drop awareness, they will attempt a COM native “get” operation on the FORMATETC structure created when you initiated the drag in your application (you used the DataObject wrapper and injected it with DoDragDrop, but this is what you did in effect).  If the format doesn’t match any of the formats the dragged-over applications (including Windows Explorer, a.k.a. the desktop) accept, this exception is thrown.  It is then, typically, handled either by that application or Windows itself, for example by switching to the “um, not so much” cursor (circle with line through it).  Running outside of debug mode, or between two application that agree on format, it “just works” (dropping successfully in places it can, blocking the drop in places it can’t).  In debug mode, VS is telling you, “hey, look, an exception, you could handle this if you wanted to,” but in most cases you’ll just let the OS or other applications handle it.

Bottom line: don’t sweat this one too much if your application runs okay outside the debugger and in the controlled debugging conditions described above.

Here’s an article that is the closest thing I could find to an official explanation.

Testing a .NET .asmx Web Service from LINQPad

Great post here about using the Visual Studio Command Prompt tools to pull the WSDL from a .NET .asmx (traditional, non-WCF) web service and compile a standalone class library which encapsulates the interface.  Basically, run a Visual Studio prompt, switch to a writable directory and run these two commands:

Convert WSDL to service reference .cs classes:

wsdl http://z.com/MyService.asmx

That will create classes in the global namespace and name class file based on the service URL.

Alternately, you can add a namespace and control the name of the output file:

wsdl http://z.com/MyService.asmx /n:MyServiceNamespace /o:MyService.cs

Compile into a class library:

csc /t:library MyService.cs

Then in LINQPad add a reference to the new .dll and the System.Web.Services .NET library.

Then you can write this in LinqPAD:

// create an instance of the service 
var service = new MyService();

// Or namespace version
var service2 = new MyServiceNamespace.MyService();
// invoke a web method and dump the results
service.MyOperation().Dump();

 

Visual Studio 2010 SP1 Slow to Start with “Loading toolbox content” Status

I still install VS2010 on all new machines for a number of reasons.  It seems that inevitably over the course of the life of the install I will eventually run into a problem where Visual Studio becomes slow to start, getting “stuck” for many seconds with this message in the status bar:

Loading toolbox content from package Microsoft.VisualStudio.IDE.Toolbox.ControlInstaller.ToolboxInstallerPackage ‘{2C98B35-07DA-45F1-96A3-BE55D91C8D7A}’

I initially theorized this had something to do with the Telerik control suite updating (via its outside “Control Panel” installer), but even completely uninstalling that didn’t solve it.  Since I keep doing a Google search to remember the real solution, I wanted to permalink the answer.  Basically:

  1. Close all instances of Visual Studio
  2. Back up registry (regedit, File|Export, Export Range: All)
  3. Delete registry key: [HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\VisualStudio\10.0\Packages\{2c298b35-07da-45f1-96a3-be55d91c8d7a}] (I also back up the specific key I’m deleting before I do it, just in case I click the wrong one)
  4. Navigate to C:\Users\WindowsUserAccount\AppData\Local\Microsoft\VisualStudio\10.0\
  5. Move toolbox*.tbd (typically four files) out of this folder and to a backup location:
  6. Restart Visual Studio

The toolbox*.tbd files will be immediately recreated on launch, probably at a much smaller size.  After the first launch, during which the toolbox is being rebuilt, Visual Studio should start much more quickly.

 

SQL Server Upgrade – Transfering User Names and Passwords

When upgrading to a new version of SQL Server, moving the databases themselves is pretty straightforward–you can use either backup and restore, or detach and attach.  However, you’re likely to end up with orphaned users if you do this.  Recreating the users manually is tedious (especially when you consider tracking down and/or changing all the passwords) and may cause SID conflicts.  Fortunately, Microsoft provides a helper script in this knowledgebase article:

How to transfer the logins and the passwords between instances of SQL Server 2005 and SQL Server 2008

Some things to be aware of:

  • You need to review and edit the output of this script as it will capture all logins, including some you likely don’t want to recreate (built-in accounts, Windows groups, plus “real” accounts that may be dead wood).
  • Even though the page specifically says

    “The steps in this article do not transfer the default database information for a particular login. This is because the default database may not always exist on server B. To define the default database for a login, use the ALTER LOGIN statement by passing in the login name and the default database as arguments.”

    the resulting CREATE LOGIN scripts do include the DEFAULT DATABASE parameter.  As they indicate above, this can cause problems if the database isn’t present.  I set most of mine back to “master.”

I run this, selectively, before restoring target databases and everything “just works.”  Used judiciously, it’s a real time-saver.

How to Show Removed Devices in Device Manager

Admit it, sometimes you uninstall hardware without removing the driver first. Sometimes the hardware just dies and Windows makes it “go away.” The problem is, the driver and configuration are still there, and if you were using the default Microsoft drivers, there’s no visible way to uninstall those.

Here’s a workaround:

  1. Obtain an administrative command prompt (Start|All Programs|Accessories, right click on Command Prompt and click Run as Administrator)
  2. At the command prompt, type or paste:
    set devmgr_show_nonpresent_devices=1
  3. Then type:
    devmgmt.msc
  4. This will pop up a normal-looking Device Manager, in the menu of which click View|Show Hidden Devices
  5. Expand the relevant part of the tree and (with care) uninstall away!

How to Connect Process ID to Application Pool in IIS

Under IIS6, you used to be able to run this .vbs script at the command line to list all the running app pools and view their Proc Ids:

c:\windows\system32\iisapp.vbs

That script isn’t shipped with IIS7, and it wouldn’t run anyway without modification and the “IIS6 Management Compatibility” installed.  Instead, you can use appcmd.exe to obtain similar information using this command line:

C:\Windows\System32\inetsrv\appcmd list wp

Note, you will need to run the command line as administrator and be in one of the directories where appcmd.exe esists.  I used the 32-bit example here, but on my 64-bit Windows 7 and Server 2008 machines, appcmd.exe exists in both of these directories and produces the same results:

C:\Windows\System32\inetsrv

C:\Windows\SysWOW64\inetsrv

Gmail “Safe Mode” – Log In with Labs Disabled

This morning, on first attempting to connect to gmail using  Chrome 4.0.223.16, I began experiencing this message:

You have been signed out of this account.

This may have happened automatically because another user signed in from the same browser. To continue using this account, you will need to sign in again. This is done to protect your account and ensure the privacy of information.

Upon clicking okay, I was immediately logged out of my account. Clearing cache worked to log in once, but the problem recurred upon subsequent log out-log in cycles. Per this thread at Google support, the answer was to start gmail in “safe mode” with labs disabled using this url:

http://mail.google.com/mail/?labs=0

From there, I went into settings and turned off the “Super Stars” lab, though other users have reported the problem from other labs as well.

Update: The problem recurred, even with “Super Stars” turned off, so I disabled all Labs.