Byte arrays (byte[]) are preferred over string objects for holding passwords and other sensitive data in memory. In .NET, the abstract class System.Security.Cryptography.HashAlgorithm, from which the hash algorithm implementations (MD5, SHA-256 and so on) derive, does not accept strings at all: its ComputeHash method takes only a byte[] or a Stream as input, so a string has to be converted to bytes first.
    using System.Security.Cryptography;
    using System.Text;

    public static partial class ExtensionMethods
    {
        /// <summary>
        /// Calculates the MD5 hash of a string
        /// </summary>
        /// <remarks>
        /// Use of string is unsafe - use byte[] instead
        /// </remarks>
        /// <param name="input">The input string.</param>
        /// <returns>The MD5 hash as an uppercase hexadecimal string, or null if the input is null</returns>
        public static string CalculateMd5Hash(this string input)
        {
            if (input == null) return null;
            // collects the hexadecimal representation of the hash
            StringBuilder sb = new StringBuilder();
            // create a new HashAlgorithm
            using (HashAlgorithm md5 = MD5.Create())
            {
                // convert to bytes in order to compute the hash;
                // UTF-8 is used because Encoding.ASCII replaces every non-ASCII character with '?'
                byte[] inputBytes = Encoding.UTF8.GetBytes(input);
                // hash the bytes
                byte[] hash = md5.ComputeHash(inputBytes);
                // append each byte as two hexadecimal digits
                foreach (byte b in hash)
                {
                    sb.Append(b.ToString("X2"));
                }
            }
            return sb.ToString();
        }
    }
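Besides byte[], ComputeHash also accepts a Stream. The following is only a minimal sketch of that overload; the class name StreamHashExample and the method HashStream are hypothetical names chosen for illustration.

```csharp
using System.IO;
using System.Security.Cryptography;

public static class StreamHashExample
{
    // Hypothetical helper: hashes everything from the stream's current position to its end.
    public static byte[] HashStream(Stream input)
    {
        using (HashAlgorithm md5 = MD5.Create())
        {
            // ComputeHash(Stream) reads the stream in chunks, so the whole
            // payload does not have to be held in a single array.
            return md5.ComputeHash(input);
        }
    }
}
```

Calling HashStream on a MemoryStream that wraps a byte array should give the same 16-byte result as calling ComputeHash on the array directly.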
Byte arrays are preferred because strings are not a secure way of holding passwords in memory. Strings are immutable: once a string has been created, its contents cannot be overwritten, so any process that can dump memory can read the value for as long as the garbage collector has not reclaimed the string.
Since the timing of garbage collection is not guaranteed, storing sensitive data such as passwords in strings could present a security risk.
An array, by contrast, can be explicitly wiped after use. You can overwrite the array with any value so that the sensitive data is no longer present in memory, even before garbage collection. If the array still has to be serialized afterwards, take care that the value used for overwriting does not break serialization.
    using System.Security.Cryptography;

    public static partial class ExtensionMethods
    {
        /// <summary>
        /// Clear a general array
        /// </summary>
        /// <typeparam name="T">The element type of the array</typeparam>
        /// <param name="b">The array to overwrite</param>
        public static void Clear<T>(this T[] b)
        {
            if (b == null) return;
            for (int n = 0; n < b.Length; n++)
            {
                b[n] = default(T);
            }
        }

        /// <summary>
        /// Clear an array of bytes but let it be serializable
        /// </summary>
        /// <param name="b">The byte array to overwrite</param>
        public static void Clear(this byte[] b)
        {
            if (b == null) return;
            for (int n = 0; n < b.Length; n++)
            {
                b[n] = 48; // 48 is the ASCII code of the character '0'
            }
        }

        /// <summary>
        /// Calculates the MD5 hash from a byte array
        /// </summary>
        /// <remarks>
        /// This is a more secure way of handling password data.
        /// Note that the input array is overwritten as a side effect.
        /// </remarks>
        /// <param name="input">The input byte array.</param>
        /// <returns>A md5 byte array, or an empty array if the input is null</returns>
        public static byte[] CalculateMd5HashSecure(this byte[] input)
        {
            byte[] returnValue = new byte[] {};
            if (input == null) return returnValue;
            // create a new HashAlgorithm
            using (HashAlgorithm md5 = MD5.Create())
            {
                // hash the bytes
                returnValue = md5.ComputeHash(input);
                // clear the input (resolves to the byte[] overload of Clear)
                input.Clear();
            }
            return returnValue;
        }
    }
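The wipe-after-use idea can also be written so that the buffer is cleared even when hashing throws. The following is a minimal sketch, not a definitive implementation: WipeExample and HashAndWipe are hypothetical names, and the built-in Array.Clear is used here instead of the Clear extension methods above, so the buffer ends up filled with zero bytes rather than with the value 48.

```csharp
using System;
using System.Security.Cryptography;

public static class WipeExample
{
    // Hypothetical helper: hashes the buffer and always wipes it afterwards.
    public static byte[] HashAndWipe(byte[] secret)
    {
        if (secret == null) throw new ArgumentNullException(nameof(secret));
        try
        {
            using (HashAlgorithm md5 = MD5.Create())
            {
                return md5.ComputeHash(secret);
            }
        }
        finally
        {
            // Runs on success and on failure; sets every element to 0.
            Array.Clear(secret, 0, secret.Length);
        }
    }
}
```

After HashAndWipe returns, the caller's array contains only zero bytes, so the caller must not expect to reuse the original contents.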
Using a byte array only narrows the window of opportunity for an attacker, and only for one specific kind of memory attack. Copies of the data, even of a char[] or byte[], might be left behind in memory (for example when the garbage collector moves the array) and are not cleared until that memory is physically reused, regardless of what the program does.
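One way to reduce the number of stray copies is to pin the buffer so that the garbage collector cannot relocate it while it holds the secret. The following is a rough sketch under that assumption; PinnedSecret is a hypothetical type, and pinning does nothing about copies made elsewhere (for example by the code that originally read the password).

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper: a byte buffer that stays at one address and is wiped on Dispose.
public sealed class PinnedSecret : IDisposable
{
    // Not readonly: GCHandle is a struct and Free() must mutate this field.
    private GCHandle _handle;

    public byte[] Buffer { get; }

    public PinnedSecret(int length)
    {
        // The array is still all zeros here, so a move before pinning leaks nothing.
        Buffer = new byte[length];
        // Pinning stops the garbage collector from relocating the array.
        _handle = GCHandle.Alloc(Buffer, GCHandleType.Pinned);
    }

    public void Dispose()
    {
        // Wipe first, then release the pin.
        Array.Clear(Buffer, 0, Buffer.Length);
        if (_handle.IsAllocated) _handle.Free();
    }
}
```

The secret would be written into Buffer only after construction and the instance wrapped in a using block. Long-lived pinned objects can fragment the managed heap, so this suits short-lived secrets.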
A potential advantage is that a developer is less likely to log the clear-text password by accident, as XmlSerializer writes byte arrays as Base64 by default.
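The Base64 behaviour can be seen with a small sketch. The Credentials class and the ToXml helper are hypothetical and exist only for this illustration. Keep in mind that Base64 is an encoding, not encryption: it hides the password from a casual glance at a log, not from anyone who decodes it.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical DTO used only for this illustration.
public class Credentials
{
    public string UserName { get; set; }
    public byte[] Password { get; set; }
}

public static class SerializationExample
{
    // Serializes the object to an XML string using XmlSerializer defaults.
    public static string ToXml(Credentials credentials)
    {
        var serializer = new XmlSerializer(typeof(Credentials));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, credentials);
            return writer.ToString();
        }
    }
}
```

For a Password holding the UTF-8 bytes of "secret", the resulting XML contains the element `<Password>c2VjcmV0</Password>` rather than the clear text.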
As an alternative, .NET provides the SecureString class, but it stores characters rather than bytes. Fortunately, a char[] can be converted to a byte[] using Encoding.UTF8.GetBytes(chars).
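A possible way to get from a SecureString to a byte[] without creating an intermediate string is sketched below. SecureStringExample and ToUtf8Bytes are hypothetical names. The sketch relies on Marshal.SecureStringToGlobalAllocUnicode to copy the characters into unmanaged memory, and wipes both that memory and the temporary char[] afterwards.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Security;
using System.Text;

public static class SecureStringExample
{
    // Hypothetical helper: returns the UTF-8 bytes of the SecureString contents.
    // The caller is responsible for wiping the returned array after use.
    public static byte[] ToUtf8Bytes(SecureString secure)
    {
        if (secure == null) throw new ArgumentNullException(nameof(secure));
        char[] chars = new char[secure.Length];
        IntPtr ptr = IntPtr.Zero;
        try
        {
            // Copy the characters into unmanaged memory, then into the char[].
            ptr = Marshal.SecureStringToGlobalAllocUnicode(secure);
            Marshal.Copy(ptr, chars, 0, chars.Length);
            return Encoding.UTF8.GetBytes(chars);
        }
        finally
        {
            // Wipe the temporary copies whether or not the conversion succeeded.
            Array.Clear(chars, 0, chars.Length);
            if (ptr != IntPtr.Zero) Marshal.ZeroFreeGlobalAllocUnicode(ptr);
        }
    }
}
```

The returned byte[] can then be passed to a hash function and cleared afterwards.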