Sending control characters to flat files, AKA “The data send failed unexpectedly”

Not every error in BizTalk is what it pretends to be. An error in the data can break the FTP connection while a file is being sent. The recipient thinks he got the file, but part of it is missing. I learned this recently.

The Error

In BizTalk there is a flow that routes data from a FILE receive location to an FTP send port. The file is received in XML format and sent as a flat file using the flat file assembler in the send pipeline. One cold winter morning BizTalk started suspending messages on the send port with the following error:

The data send failed unexpectedly. Inner Exception details: “(null)”.

And in the event log you could read:

A message sent to adapter “FTP” on send port “XmlValidatorTest.Send” with URI “ftp://computername:21/XmlValidator/%SourceFileName%_%MessageID%.txt” is suspended.
Error details: The data send failed unexpectedly. Inner Exception details: “婫”.
MessageId: {B276AD78-F7C2-4340-8228-A00FD022B6C0}
InstanceID: {46B0D9CA-62CA-43DE-B01E-5F5F1A319C96}

And sometimes it would suspend the message saying:

An unexpected failure occurred when processing a message. The text associated with the exception is “Attempted to read or write protected memory. This is often an indication that other memory is corrupt.”.

Investigation

At first I thought it could be an FTP server issue, but we had other flows in BizTalk saving flat files on the same FTP server and they were running fine. The administrator checked the folder settings on the FTP server and found no issues. The FTP send ports in BizTalk were configured correctly. The FTP server logs showed error 426, but with no explanation why:

<150 Opening BINARY mode data connection for FILE NAME.CSV.
<426 Connection closed; transfer aborted.

The recipient of the data was contacted and he told us he had got the data. Strange. The message was suspended and he still got it. I couldn’t understand it. But then he came back to us and said: “I got only part of the file, a lot of data is missing.”

The XML file is 470 KB and the expected size of the flat file is 91 KB. Instead of 91 KB of data, only 80 KB was transferred.

And then I wondered: “Maybe this is not an FTP server issue, but a data issue?” Using the BizTalk Administration console I saved the XML file to disk and tried to open it with Internet Explorer. IE wouldn’t render it properly, which told me I was going in the right direction.

I validated the file using Visual Studio 2010 and finally got my answer:

‘ ’, hexadecimal value 0x0B, is an invalid character. Line 19274, position 43.
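
The same kind of well-formedness check can be scripted outside Visual Studio. The sketch below is only an illustration in Python, not part of the original BizTalk flow. It relies on the standard library XML parser, and the function name `find_xml_error` and the sample document are made up for the example. The error text it reports is worded differently from the Visual Studio message.

```python
import xml.etree.ElementTree as ET


def find_xml_error(xml_bytes):
    """Return (line, column, message) for the first parse error, or None.

    Hypothetical helper for illustration only. The line number is 1-based;
    the column is reported as the parser provides it.
    """
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


# Made-up sample: a customer name containing a raw 0x0B character.
sample = b"<Customers>\n  <Name>John\x0bSmith</Name>\n</Customers>"
print(find_xml_error(sample))
```

A result of `None` means the parser accepted the document; anything else points at the first place where it gave up.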

Then I removed all 0x0B characters from the XML source file and tested the flow again.
And it worked.
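
An XML parser stops at the first invalid character, so before cleaning a file it can help to list every offending byte. The following is a minimal sketch in Python, assuming the file is ASCII or UTF-8 encoded (in those encodings a byte below 0x20 is always a control character). The function name `find_control_bytes` is hypothetical, and columns are counted in bytes, not characters.

```python
# Control bytes that are allowed unescaped in XML 1.0: TAB, LF and CR.
ALLOWED = {0x09, 0x0A, 0x0D}


def find_control_bytes(data):
    """Yield (line, column, byte_value) for each forbidden control byte.

    Line and column are 1-based. Assumes ASCII or UTF-8 input; this
    would not work for UTF-16 files.
    """
    line, column = 1, 0
    for byte in data:
        column += 1
        if byte < 0x20 and byte not in ALLOWED:
            yield line, column, byte
        if byte == 0x0A:
            # A line feed starts a new line.
            line += 1
            column = 0


# Made-up sample data with one 0x0B byte on the second line.
for line, column, value in find_control_bytes(b"ab\ncd\x0bef"):
    print("line {}, position {}: 0x{:02X}".format(line, column, value))
```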

Solution

That cold winter morning the data source started sending the 0x0B control character in the name field of the customer data. Probably the user interface allowed copy-pasting data from Microsoft Excel, where this character is used for separating multiple lines in one cell.

At the end of the day the export procedure at the data source was modified to replace control characters with spaces.

The Findings

Finding 1

If the file you’re sending contains a control character somewhere near the beginning, you’re lucky: BizTalk will suspend the message with a proper error message and the file will not be transferred at all.

There was a failure executing the send pipeline: “XmlValidatorTest.FlatFileTransmit, XmlValidatorTest, Version=1.0.0.0, Culture=neutral, PublicKeyToken=KLJ60d05fe7YY604” Source: “Flat file assembler” Send Port: “XmlValidatorTest.Send” URI: ftp://computername:21/XmlValidator/%SourceFileName%_%MessageID%.txt Reason: ‘ ‘, hexadecimal value 0x0B, is an invalid character. Line 3, position 225.

Finding 2

You might need to restart the host instances associated with the send port when you see the following error:

An unexpected failure occurred when processing a message. The text associated with the exception is “Attempted to read or write protected memory. This is often an indication that other memory is corrupt.”.

During testing I noticed that when this error is raised and you send a good message (without control characters) within the next 2 minutes, it also gets suspended with the same error.
But if you send a good message 10 minutes later, it goes through fine. It looks like some garbage (or cache) that takes time to clean up.

Finding 3

It is pretty random which of these two errors is raised:

“The data send failed unexpectedly. Inner Exception details: “(null)”.”

or

“An unexpected failure occurred when processing a message. The text associated with the exception is “Attempted to read or write protected memory. This is often an indication that other memory is corrupt.”.”

Finding 4

It does not help to place an XML validator pipeline component in the Pre-Assemble stage of the send pipeline, before the Flat File Assembler pipeline component.

XmlValidator validates the message against its schema; it does not check for the presence of control characters.

Finding 5

If node values in the XML file contain control characters, only a few of them can be sent to flat files without a problem. Of the characters with hex codes in the range 0x00 to 0x1F (0 to 31 decimal), only 0x09 (TAB), 0x0A (LF) and 0x0D (CR) are supported as they are. Sending the other control characters from that range requires an XML escape sequence. The result is still not well-formed XML according to the XML 1.0 specification, but at least the characters get through to the flat file.

Below is a list of the control characters and whether escaping them in the XML file is mandatory (M) or optional (O). I tested each character in the table.

Decimal code  Hex code  XML escape sequence  Escape usage
0             00        &#x00;               M
1             01        &#x01;               M
2             02        &#x02;               M
3             03        &#x03;               M
4             04        &#x04;               M
5             05        &#x05;               M
6             06        &#x06;               M
7             07        &#x07;               M
8             08        &#x08;               M
9             09        &#x09;               O
10            0A        &#x0A;               O
11            0B        &#x0B;               M
12            0C        &#x0C;               M
13            0D        &#x0D;               O
14            0E        &#x0E;               M
15            0F        &#x0F;               M
16            10        &#x10;               M
17            11        &#x11;               M
18            12        &#x12;               M
19            13        &#x13;               M
20            14        &#x14;               M
21            15        &#x15;               M
22            16        &#x16;               M
23            17        &#x17;               M
24            18        &#x18;               M
25            19        &#x19;               M
26            1A        &#x1A;               M
27            1B        &#x1B;               M
28            1C        &#x1C;               M
29            1D        &#x1D;               M
30            1E        &#x1E;               M
31            1F        &#x1F;               M
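
The rule behind the table is simple enough to express in a few lines. This is a minimal sketch in Python that encodes only what the table states (TAB, LF and CR may stay as they are, everything else below 0x20 must be escaped). The names `escape_usage` and `xml_escape_sequence` are made up for the illustration.

```python
# Codes that may appear unescaped according to the table: TAB, LF, CR.
ALLOWED_RAW = {0x09, 0x0A, 0x0D}


def escape_usage(code):
    """Return "O" (escaping optional) or "M" (escaping mandatory)."""
    if not 0 <= code <= 0x1F:
        raise ValueError("only codes 0x00 to 0x1F are covered by the table")
    return "O" if code in ALLOWED_RAW else "M"


def xml_escape_sequence(code):
    """Return the hexadecimal character reference, e.g. 11 gives &#x0B;."""
    return "&#x{:02X};".format(code)


# Print one row per control character, in the same column order as the table.
for code in range(0x20):
    print("{:<14}{:<10}{:<21}{}".format(
        code, "{:02X}".format(code), xml_escape_sequence(code), escape_usage(code)))
```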

Conclusions

There are several options for a long-term solution:

  1. Request the data source to deliver XML files without control characters
  2. Implement a custom pipeline component that raises an error if control characters are received
  3. Implement a custom pipeline component that replaces unescaped control characters with a configurable value (or removes them)
  4. Implement a custom pipeline component that escapes unescaped control characters (e.g. “&#x0B;”)
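
The core logic of options 2, 3 and 4 can be sketched independently of BizTalk. The Python below is not a pipeline component, only an illustration of the three behaviours applied to a node value. The function name `handle_control_chars` and the mode names are assumptions made for this example.

```python
import re

# Control characters 0x00 to 0x1F except TAB (0x09), LF (0x0A) and CR (0x0D).
FORBIDDEN = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def handle_control_chars(text, mode, replacement=" "):
    """Apply one of three hypothetical strategies to a node value.

    mode "reject":  raise an error if a control character is present (option 2)
    mode "replace": substitute a configurable value, "" removes (option 3)
    mode "escape":  emit a hexadecimal character reference (option 4)
    """
    if mode == "reject":
        match = FORBIDDEN.search(text)
        if match:
            raise ValueError("control character 0x{:02X} at offset {}".format(
                ord(match.group()), match.start()))
        return text
    if mode == "replace":
        # A function is used so the replacement is inserted literally.
        return FORBIDDEN.sub(lambda m: replacement, text)
    if mode == "escape":
        return FORBIDDEN.sub(
            lambda m: "&#x{:02X};".format(ord(m.group())), text)
    raise ValueError("unknown mode: " + mode)


print(handle_control_chars("John\x0bSmith", "replace"))
print(handle_control_chars("John\x0bSmith", "escape"))
```

Which mode is appropriate depends on whether the destination expects the characters at all; rejecting is the only one that makes the problem visible to the data source.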

Whenever control characters are delivered from the source, make sure that the destination can (and should) handle them. I reckon that most likely they are being sent unintentionally.

