C++ - Encoding data to hex and back

29. March 2012 20:29

 

Text-based network protocols, and client / server communication in general, can make it awkward to transmit data that contains quotes or other special characters. The same problem comes up when storing data in ini files.

 

There is quite a simple solution to this: encode the data before sending it and decode it at the other end. It would probably be better to use the well-known base64 encoding, but if you don't have an implementation of it available, the following can work just as well.

 

The encoder

 

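#include <cstdio>
#include <string>

std::string HexEncode(const std::string &str) {
	std::string tmp;
	char buf[3]; // two hex digits plus the terminating NUL

	// Walk the string by length rather than stopping at a NUL,
	// so that zero-value bytes inside the data are encoded too.
	for (std::string::size_type i = 0; i < str.size(); ++i) {
		// The cast keeps bytes of 0x80 and above from being sign-extended.
		sprintf(buf, "%02X", (unsigned char) str[i]);
		tmp += buf;
	}

	return tmp;
}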

 

 

The decoder

 

 

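std::string HexDecode(const std::string &str) {
	std::string tmp;
	unsigned int x;

	// Consume two hex digits per byte; a trailing odd digit is ignored
	// instead of stepping past the end of the string.
	for (std::string::size_type i = 0; i + 1 < str.size(); i += 2) {
		// Stop if sscanf cannot parse a hex value at this position.
		if (sscanf(str.c_str() + i, "%2X", &x) != 1) break;
		tmp += (char) x;
	}

	return tmp;
}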
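As an alternative to the sprintf / sscanf pair above, here is a minimal sketch that works digit by digit and rejects malformed input instead of guessing. The names ToHex, HexValue and FromHex are hypothetical and not part of the original post, and the letter ranges assume an ASCII-compatible character set.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: encodes every byte of data, including zero bytes,
// as two uppercase hex digits.
std::string ToHex(const std::string &data) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(data.size() * 2);
    for (std::size_t i = 0; i < data.size(); ++i) {
        unsigned char byte = static_cast<unsigned char>(data[i]);
        out += digits[byte >> 4];    // high nibble
        out += digits[byte & 0x0F];  // low nibble
    }
    return out;
}

// Hypothetical helper: value of one hex digit, or -1 if ch is not a hex digit.
// Assumes 'A'..'F' and 'a'..'f' are contiguous, as they are in ASCII.
int HexValue(char ch) {
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    return -1;
}

// Hypothetical helper: decodes hex into out. Returns false and leaves out
// empty if hex has an odd length or contains a non-hex character.
bool FromHex(const std::string &hex, std::string &out) {
    out.clear();
    if (hex.size() % 2 != 0) return false;
    std::string result;
    result.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        int hi = HexValue(hex[i]);
        int lo = HexValue(hex[i + 1]);
        if (hi < 0 || lo < 0) return false;
        result += static_cast<char>((hi << 4) | lo);
    }
    out.swap(result);
    return true;
}
```

Reporting failure through the return value is a design choice for this sketch: the caller can tell a corrupt message apart from an empty one, which a decoder that silently stops early cannot express.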

 

The only problem with the above is that it doubles the size of the data once it is encoded, but most other encodings, like base64 or uuencoding, also increase the size of the data, just by less.
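To put rough numbers on the size overhead, the following sketch compares the encoded length of hex with that of padded base64, where every 3 input bytes (rounded up) become 4 characters. The function names are hypothetical and only serve the illustration.

```cpp
#include <cstddef>

// Hypothetical helper: hex uses two characters per input byte.
std::size_t HexEncodedSize(std::size_t n) {
    return n * 2;
}

// Hypothetical helper: padded base64 emits 4 characters for every
// group of up to 3 input bytes.
std::size_t Base64EncodedSize(std::size_t n) {
    return (n + 2) / 3 * 4;
}
```

For 300 bytes of input this gives 600 characters for hex and 400 for base64, so hex costs about half as much again as base64 in exchange for a much simpler implementation.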



Comments (4) -

5/10/2012 11:43:52 PM #

HexEncode() contains two!! bugs

The following construction is weird:
char buf[3];
sprintf(buf, "%X", *c);

1) char is usually signed, so chars from the upper half of the table (0x80 and above) will be converted into values like 0xFFFFFF8E. Such a value will overflow the stack buffer, and the resulting hex string ("FFFFFF8E") will be incorrect.
Improved code: sprintf(buf, "%X", (unsigned char)*c)
But it is better to increase buffer too.

2) The TAB char has a value less than 0x10, so only one hex char will be generated.
Improved code:
char buf[20];
sprintf(buf, "%02X", (unsigned char)*c)

HexDecode() has a potential bug (it will be a real bug if you used the HexEncode() above before)

Using "c += 2;" we can run past the end of the string if it contains an odd number of chars.
Improved code:
if(!c[1]) break;
c += 2;

Sergey K Belarus |

5/11/2012 8:50:59 AM #


Thanks for pointing that out. I have updated the post with the changes

james United Kingdom |

5/12/2012 1:19:30 PM #

"%2X" will insert " 9" for the TAB character.
You should use "%02X"

Sergey K Belarus |

7/10/2012 12:53:47 AM #

One more bug... We want to hex-encode to safely encapsulate binary values... and ZERO is a perfectly valid binary value. But if your code hex-encodes a binary string with zero-value bytes ... truncated string! It will be slower but safer to replace the "while (*c != 0)" loop in the encoder with a for loop like "for (size_t i = 0; i < str.length(); i++, c++)"

Fred United States |