Password Algorithms: Internet Explorer 7, 8, 9

Introduction

IE10 on Windows 8 uses a different algorithm for encryption and storage, so I might follow up with a separate entry later. For now I’m analysing version 9.0.9 on Windows 7.
Everything here should work fine with legacy IE 7 and 8.

Considering customers may avoid migrating to Windows 8, I thought this protection was still worth covering in detail.

Storage

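All autocomplete entries for a user are stored in that user’s registry hive (NTUSER.DAT), under the IntelliForms\Storage2 key. Each entry is a registry value whose name is a SHA-1 hash (plus one checksum byte) and whose data is a DPAPI blob. Here’s a dump of some entries from my own system: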

C:\>reg query "HKCU\Software\Microsoft\Internet Explorer\IntelliForms\Storage2"

HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\IntelliForms\Storage2
    6FBD22A243E7F5A0D660199683F52543E80CEB99EC    REG_BINARY    01000000D08C9DDF0115D1118. . .
    DF11F9BE8F0049A2FBFF29C6D49FE77383C2A6783A    REG_BINARY    01000000D08C9DDF0115D1118. . .
    E4CE6B2B79515319A7360D97E3B217F2FC843CC019    REG_BINARY    01000000D08C9DDF0115D1118. . .

The blobs have been truncated to avoid potential offline decryption.
Whenever IE connects to a site which requires login credentials, it will:

  1. Derive the SHA-1 checksum of lowercase(URL).
  2. Search for the checksum among the autocomplete entries.
  3. If the checksum is found, decrypt the DPAPI blob using the URL as entropy and autofill the login fields.

Generation

Take the second hash:

DF11F9BE8F0049A2FBFF29C6D49FE77383C2A678 3A

This is a SHA-1 checksum of the Unicode (UTF-16LE) string “https://accounts.google.com/servicelogin”, terminating null included.
The last byte, 0x3A, is a checksum: the sum of all 20 bytes of the SHA-1 result, truncated to 8 bits.
The following function demonstrates this with the Windows crypto API:

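#include <windows.h>
#include <wincrypt.h>
#include <algorithm>
#include <cwctype>
#include <string>

// Builds the Storage2 value name for a URL: SHA-1 of the lowercase
// UTF-16 string (terminator included) followed by a 1-byte additive checksum.
bool GetUrlHash(std::wstring url, std::wstring &result) {

  HCRYPTPROV hProv;
  HCRYPTHASH hHash;

  bool bResult = false;

  // wide-character lowercase; ::tolower is the narrow-character version
  std::transform(url.begin(), url.end(), url.begin(), ::towlower);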
  
  if (CryptAcquireContext(&hProv, NULL, NULL, PROV_RSA_FULL, 
      CRYPT_VERIFYCONTEXT)) {
      
    if (CryptCreateHash(hProv, CALG_SHA1, 0, 0, &hHash)) {
      // hash the string including its 2-byte terminator
      if (CryptHashData(hHash, (PBYTE)url.c_str(),
          (DWORD)((url.length() + 1) * sizeof(wchar_t)), 0)) {

        BYTE bHash[20];
        DWORD dwHashLen = sizeof(bHash);

        if ((bResult = (CryptGetHashParam(hHash, HP_HASHVAL, bHash,
            &dwHashLen, 0) != FALSE))) {
            
          BYTE chksum = 0;
          wchar_t ch[4];
          
          for (size_t i = 0;i < 20 + 1;i++) {
            BYTE x;
            
            if (i < 20) {
              x = bHash[i];
              chksum += x;
            } else {
              x = chksum;
            }
            wsprintf(ch, L"%02X", x);
            
            result.push_back(ch[0]);
            result.push_back(ch[1]);
          }
        }
      }
      CryptDestroyHash(hHash);      
    }
    CryptReleaseContext(hProv, 0);
  }
  return bResult;
}
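
For readers without a Windows machine to hand, the same derivation can be sketched in portable C++. This is only a sketch under stated assumptions: it handles ASCII URLs only, it assumes the description above is complete (lowercase, UTF-16LE, terminator hashed, additive checksum appended), and the names Sha1, Checksum and UrlValueName are made up for this example rather than taken from IE or the Windows API.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

using Digest = std::array<uint8_t, 20>;

static uint32_t Rol(uint32_t v, int n) { return (v << n) | (v >> (32 - n)); }

// Plain SHA-1 (FIPS 180-4) over a byte buffer.
Digest Sha1(std::vector<uint8_t> msg) {
  uint32_t h[5] = {0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0};
  const uint64_t bitLen = static_cast<uint64_t>(msg.size()) * 8;

  // Padding: 0x80, zeros up to 56 mod 64, then the 64-bit big-endian length.
  msg.push_back(0x80);
  while (msg.size() % 64 != 56) msg.push_back(0);
  for (int i = 7; i >= 0; --i) msg.push_back(static_cast<uint8_t>(bitLen >> (i * 8)));

  for (size_t off = 0; off < msg.size(); off += 64) {
    uint32_t w[80];
    for (int i = 0; i < 16; ++i) {
      w[i] = (uint32_t(msg[off + 4 * i]) << 24) | (uint32_t(msg[off + 4 * i + 1]) << 16) |
             (uint32_t(msg[off + 4 * i + 2]) << 8) | uint32_t(msg[off + 4 * i + 3]);
    }
    for (int i = 16; i < 80; ++i) w[i] = Rol(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);

    uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
    for (int i = 0; i < 80; ++i) {
      uint32_t f, k;
      if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999; }
      else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1; }
      else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDC; }
      else             { f = b ^ c ^ d;                   k = 0xCA62C1D6; }
      const uint32_t t = Rol(a, 5) + f + e + k + w[i];
      e = d; d = c; c = Rol(b, 30); b = a; a = t;
    }
    h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
  }

  Digest out;
  for (int i = 0; i < 5; ++i)
    for (int j = 0; j < 4; ++j)
      out[4 * i + j] = static_cast<uint8_t>(h[i] >> (24 - 8 * j));
  return out;
}

// Additive checksum: sum of the 20 digest bytes, truncated to 8 bits.
uint8_t Checksum(const Digest& d) {
  uint8_t sum = 0;
  for (uint8_t b : d) sum = static_cast<uint8_t>(sum + b);
  return sum;
}

// Builds the 42-character value name for an ASCII URL (assumption: ASCII only).
std::string UrlValueName(const std::string& url) {
  std::vector<uint8_t> buf;
  for (unsigned char c : url) {
    assert(c < 0x80 && "this sketch only handles ASCII URLs");
    if (c >= 'A' && c <= 'Z') c = static_cast<unsigned char>(c - 'A' + 'a');
    buf.push_back(c);  // low byte of the UTF-16LE code unit
    buf.push_back(0);  // high byte, zero for ASCII
  }
  buf.push_back(0);    // 2-byte terminator, hashed as well
  buf.push_back(0);

  const Digest d = Sha1(buf);
  std::string name;
  char hex[3];
  for (uint8_t b : d) { std::snprintf(hex, sizeof hex, "%02X", b); name += hex; }
  std::snprintf(hex, sizeof hex, "%02X", Checksum(d));
  name += hex;
  return name;
}
```

Calling UrlValueName() with a candidate URL gives a string that can be compared directly against the value names under Storage2. The checksum part is easy to confirm by hand: the 20 digest bytes of the second entry in the dump add up to 3130, and 3130 mod 256 is 0x3A.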

Each username and password is stored in Unicode (UTF-16) format.
If there’s more than one set of credentials for the same URL, the extra sets are appended to the existing entry.

The problem is that the actual structure for an entry is officially undocumented.
Fortunately, there’s an older revision of the structure online which helps a lot! :)

enum { MAX_STRINGS = 200 };   
enum { INDEX_SIGNATURE=0x4B434957 };
enum { INIT_BUF_SIZE=1024 };
enum { LIST_DATA_PASSWORD = 1 };

struct StringIndex {
  DWORD   dwSignature;
  DWORD   cbStringSize;   // up to not including first StringEntry
  DWORD   dwNumStrings;   // Num of StringEntry present
  INT64   iData;          // Extra data for string list user
  
  struct tagStringEntry {
    union
    {
      DWORD_PTR   dwStringPtr;    // When written to store
      LPWSTR      pwszString;     // When loaded in memory
    };
    FILETIME    ftLastSubmitted;
    DWORD       dwStringLen;        // Length of this string
  } StringEntry[];
};
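
A practical use of INDEX_SIGNATURE is as a sanity check on a decrypted blob: stored little-endian, 0x4B434957 shows up in a hex dump as the ASCII text "WICK". The sketch below is only illustrative. SignatureAsText and HasSignatureAt are hypothetical helper names, and the offset of the signature inside a real blob is left as a parameter because it depends on the revised layout, which is not documented here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Returns the four bytes of a 32-bit value in the order they appear
// in a little-endian memory dump.
std::string SignatureAsText(uint32_t sig) {
  std::string s;
  for (int i = 0; i < 4; ++i) {
    s.push_back(static_cast<char>((sig >> (8 * i)) & 0xFF));
  }
  return s;
}

// True when the 4 bytes at `offset` hold INDEX_SIGNATURE (little-endian).
// The caller chooses the offset; this sketch makes no claim about where
// the signature sits in a real decrypted blob.
bool HasSignatureAt(const uint8_t* buf, size_t len, size_t offset) {
  const uint32_t kIndexSignature = 0x4B434957;
  assert(buf != nullptr);
  if (offset > len || len - offset < 4) return false;  // not enough bytes left
  const uint32_t v = uint32_t(buf[offset]) |
                     (uint32_t(buf[offset + 1]) << 8) |
                     (uint32_t(buf[offset + 2]) << 16) |
                     (uint32_t(buf[offset + 3]) << 24);
  return v == kIndexSignature;
}
```

If a candidate URL was wrong, CryptUnprotectData() simply fails, so this check is mostly useful for locating the header when experimenting with the layout.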

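Parsing a decrypted blob using this structure for reference caused a few headaches and required minor changes; the ParseBlob() code in the Recovery section relies on two fields from the adjusted layout, cbHdrSize and cbStringSize1, which do not appear in the older revision above. In IEFrame.dll, CryptProtectData() is used with the URL as entropy to encrypt StringIndex + credentials.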

The next problem is discovering the original URL used as entropy, and this is what makes the IE password algorithm quite good: without the right URL, DPAPI will not decrypt the blob, even for the logged-on user.

Obtaining URLs

There are a number of ways to harvest URLs for the purpose of recovering IE7-IE9 passwords. The browsing history and cache normally hold a list of visited websites which can be enumerated.
Here’s one such way, enumerating the history through COM:

void EnumCache1() {
  HRESULT hr = CoInitialize(NULL);
  
  if (SUCCEEDED(hr)) {
    IUrlHistoryStg2 *pHistory = NULL;
    hr = CoCreateInstance(CLSID_CUrlHistory, NULL, 
        CLSCTX_INPROC_SERVER, 
        IID_IUrlHistoryStg2,(void**)(&pHistory));
    
    if (SUCCEEDED(hr)) {
      IEnumSTATURL *pUrls = NULL;
      hr = pHistory->EnumUrls(&pUrls);
            
      if (SUCCEEDED(hr)) {
        while (TRUE) {
          STATURL st;
          ULONG result;
          
          hr = pUrls->Next(1, &st, &result);
          
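          if (SUCCEEDED(hr) && result == 1) {

            AddUrl(st.pwcsUrl);

            // the enumerator allocates these strings; the caller must free them
            CoTaskMemFree(st.pwcsUrl);
            CoTaskMemFree(st.pwcsTitle);

          } else {
            break;
          }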
        }
        pUrls->Release();
      }
      pHistory->Release();
    }
    CoUninitialize();
  }  
}

And another using the WinINet API, this time walking the URL cache:

void EnumCache2()   
{   
  HANDLE hEntry;   
  DWORD dwSize;
  BYTE buffer[8192];
  LPINTERNET_CACHE_ENTRY_INFO info = (LPINTERNET_CACHE_ENTRY_INFO) buffer;
  
  dwSize = 8192;
  hEntry = FindFirstUrlCacheEntry(NULL, info, &dwSize);
  
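  if (hEntry != NULL) {
    do {
      // CacheEntryType is a bitmask, so test the cookie flag instead of comparing
      if (!(info->CacheEntryType & COOKIE_CACHE_ENTRY)) {
        AddUrl(info->lpszSourceUrlName);
      }
      dwSize = 8192;
      // note: an entry larger than the buffer fails with
      // ERROR_INSUFFICIENT_BUFFER and ends the enumeration early
    } while (FindNextUrlCacheEntry(hEntry, info, &dwSize));
    FindCloseUrlCache(hEntry);
  }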
}

To take things a bit further, you could also parse index.dat files, but I won’t go into that here since it’s in the realm of forensics.
A better approach is probably reading candidate login URLs from a list harvested from the internet.
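
Reading such a list can be sketched as follows. This is a minimal, portable example and not the AddUrl() helper used in the listings above, whose implementation is not shown: the UrlList class, its methods and the one-URL-per-line file format are all assumptions made for illustration. It trims each line, lowercases it (IE hashes the lowercase URL) and drops duplicates so each candidate is only tried once.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <set>
#include <sstream>
#include <string>

// Hypothetical container for candidate URLs (ASCII lowercasing only).
class UrlList {
 public:
  // Normalises and stores one URL. Returns true if it was not seen before.
  bool Add(std::string url) {
    const char* ws = " \t\r\n";
    const size_t first = url.find_first_not_of(ws);
    if (first == std::string::npos) return false;      // blank line
    const size_t last = url.find_last_not_of(ws);
    url = url.substr(first, last - first + 1);
    assert(!url.empty());
    for (char& c : url) {
      if (c >= 'A' && c <= 'Z') c = static_cast<char>(c - 'A' + 'a');
    }
    return urls_.insert(url).second;
  }

  // Reads one URL per line from a stream. Returns how many new URLs were added.
  size_t Load(std::istream& in) {
    std::string line;
    size_t added = 0;
    while (std::getline(in, line)) {
      if (Add(line)) ++added;
    }
    return added;
  }

  const std::set<std::string>& urls() const { return urls_; }

 private:
  std::set<std::string> urls_;
};
```

Load() accepts any std::istream, so the same code works with a std::ifstream opened on a wordlist file or with an in-memory std::istringstream for testing.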

Recovery

Recovery is close to how the IE7-IE9 process decrypts entries, except we’re forcing the decryption using a list of candidate URLs.
The following collects the autocomplete entries from the registry:


#define MAX_URL_HASH 255
#define MAX_URL_DATA 8192

typedef struct _IE_STORAGE_ENTRY {
  std::wstring UrlHash;
  DWORD cbData;
  BYTE pbData[MAX_URL_DATA];
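} IE_STORAGE_ENTRY, *PIE_STORAGE_ENTRY;

// Collected entries. The listing below uses ac_entries without showing its
// declaration; a global std::vector (needs <vector>) is assumed here.
std::vector<IE_STORAGE_ENTRY> ac_entries;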

DWORD GetAutocompleteEntries() {

  HKEY hKey;
  DWORD dwResult;
  
  dwResult = RegOpenKeyEx(HKEY_CURRENT_USER, 
      L"Software\\Microsoft\\Internet Explorer\\IntelliForms\\Storage2",
      0, KEY_QUERY_VALUE, &hKey);
  
  if (dwResult == ERROR_SUCCESS) {
    DWORD dwIndex = 0;
    
    while (TRUE) {
      IE_STORAGE_ENTRY entry;
      
      DWORD cbUrl = MAX_URL_HASH;
      wchar_t UrlHash[MAX_URL_HASH];
      
      entry.cbData = MAX_URL_DATA;
      
      dwResult = RegEnumValue(hKey, dwIndex, UrlHash, &cbUrl, 
          NULL, 0, entry.pbData, &entry.cbData);
      
      if (dwResult == ERROR_SUCCESS) {
        entry.UrlHash = UrlHash;
        ac_entries.push_back(entry);
      } else if (dwResult == ERROR_NO_MORE_ITEMS) {
        break;
      }
      dwIndex++;
    }
    RegCloseKey(hKey);
  }  
  return (DWORD)ac_entries.size();
}

Now, with a list of URL strings and the autocomplete entries, we can attempt to decrypt each entry using CryptUnprotectData(). The decrypted data is then parsed based on the StringIndex structure.

void ParseBlob(PBYTE pInfo, const wchar_t *url) {
  
  StringIndex* pBlob = (StringIndex*)pInfo;

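  // get offset of data
  // (cbHdrSize and cbStringSize1 belong to the adjusted StringIndex layout,
  // not the older revision listed earlier)
  PBYTE pse = (PBYTE)&pInfo[pBlob->cbHdrSize + pBlob->cbStringSize1];
  
  // process 2 entries for each login; stop if a trailing entry has no pair
  for (DWORD i = 0;i + 1 < pBlob->dwNumStrings;i += 2) {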
  
    // get username and password
    wchar_t *username = (wchar_t*)&pse[pBlob->StringEntry[i + 0].dwStringPtr];
    wchar_t *password = (wchar_t*)&pse[pBlob->StringEntry[i + 1].dwStringPtr];
    
    bool bTime;
    wchar_t last_logon[MAX_PATH];
    
    if (lstrlen(password) > 0) {
      // get last time this was used
      FILETIME ft;
      SYSTEMTIME st;

      FileTimeToLocalFileTime(&pBlob->StringEntry[i].ftLastSubmitted, &ft);
      FileTimeToSystemTime(&ft, &st);

      bTime = (GetDateFormatW(LOCALE_SYSTEM_DEFAULT, 0, &st, L"MM/dd/yyyy", last_logon, MAX_PATH) > 0);
    } else {
      bTime = false;
    }
    wprintf(L"\n%-30s  %-20s  %-15s %s", username, password, bTime ? last_logon : L"NEVER", url);
  }
}
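
The decryption call itself is not listed above, so here is a minimal sketch of how one candidate URL might be tried against one blob. Two things are assumptions rather than facts from the analysis: that the entropy is the same byte buffer GetUrlHash() feeds to SHA-1 (lowercase UTF-16LE URL plus terminator), and the helper names UrlEntropy and TryDecrypt, which are invented for this example. Only ASCII letters are lowercased. The Windows-only part is guarded so the helper still compiles elsewhere.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Entropy candidate: lowercase URL as UTF-16LE bytes plus a 2-byte terminator.
// ASSUMPTION: this mirrors the buffer hashed for the value name.
std::vector<uint8_t> UrlEntropy(const std::u16string& url) {
  std::vector<uint8_t> out;
  out.reserve((url.size() + 1) * 2);
  for (char16_t c : url) {
    if (c >= u'A' && c <= u'Z') c = static_cast<char16_t>(c - u'A' + u'a');
    out.push_back(static_cast<uint8_t>(c & 0xFF));  // low byte first
    out.push_back(static_cast<uint8_t>(c >> 8));
  }
  out.push_back(0);
  out.push_back(0);
  assert(out.size() == (url.size() + 1) * 2);
  return out;
}

#ifdef _WIN32
#include <windows.h>
#include <wincrypt.h>  // link with crypt32.lib

// Tries one candidate URL against one Storage2 blob. On success the
// decrypted bytes are copied into `plain`, ready for ParseBlob().
bool TryDecrypt(const std::vector<uint8_t>& blob, const std::u16string& url,
                std::vector<uint8_t>& plain) {
  std::vector<uint8_t> entropy = UrlEntropy(url);

  DATA_BLOB in, ent, out;
  in.cbData = static_cast<DWORD>(blob.size());
  in.pbData = const_cast<BYTE*>(blob.data());
  ent.cbData = static_cast<DWORD>(entropy.size());
  ent.pbData = entropy.data();
  out.cbData = 0;
  out.pbData = NULL;

  // Fails when the entropy (i.e. the URL) or the user context is wrong.
  if (!CryptUnprotectData(&in, NULL, &ent, NULL, NULL, 0, &out)) {
    return false;
  }
  plain.assign(out.pbData, out.pbData + out.cbData);
  LocalFree(out.pbData);  // the output buffer is allocated by DPAPI
  return true;
}
#endif
```

In practice there is no need to try every URL against every blob: computing the value name for each candidate URL first and only decrypting on a match keeps the number of DPAPI calls down to one per recovered entry.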

Conclusion

Because the URL is used as entropy, recovering all autocomplete entries can be problematic.
It would be simple to recover credentials for popular services like Facebook, Gmail, Instagram, Hotmail, etc., but the less well-known services would be a problem unless the URL was stored in the cache.