Sunday, October 2, 2022

Extracting WhatsApp messages from an iOS backup


Hi everyone! 👋 I was recently exploring how to get a local backup of WhatsApp messages from my iPhone. I switched from Android to iOS in the past and lost all of my WhatsApp messages. I wanted to make sure that if I switched again from iOS to Android I wouldn’t lose any messages. I don’t really care if I can import the messages into WhatsApp. I just don’t want to lose all of the important information I have in my chats. I don’t have any immediate plans for switching (if ever) but it seemed like a fun challenge, so I started surveying the available tools and how they work.

This was mostly a learning exercise for me about how Apple stores iOS backups and how I can selectively extract information and data from one. My objective was to have a local copy of my WhatsApp messages that I can read and search through. It would be doubly awesome if I could move the messages to an Android device but, as I mentioned before, that wasn’t my main goal.

Exploring iOS backup

By default, when you create an iOS backup on a Mac (Catalina in my case), it is stored under ~/Library/Application Support/MobileSync/Backup/. This folder contains sub-folders named after unique device identifiers. Each sub-folder is a backup and contains a bunch of additional subfolders along with the following 4 important files:

  • Info.plist
  • Manifest.db
  • Manifest.plist
  • Status.plist

We primarily care about the two Manifest files.
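As a rough sketch of the layout described above, the snippet below walks a backup root and reports which of the four files are missing from each sub-folder. The function name find_backups is my own invention, and the root path in the usage comment assumes the default macOS location.

```python
import os

# The four files mentioned above that every backup folder should contain.
EXPECTED_FILES = ("Info.plist", "Manifest.db", "Manifest.plist", "Status.plist")


def find_backups(backup_root):
    """Map each backup folder name to the expected files missing from it."""
    backups = {}
    for entry in sorted(os.listdir(backup_root)):
        folder = os.path.join(backup_root, entry)
        if not os.path.isdir(folder):
            continue  # skip stray files sitting next to the backup folders
        backups[entry] = [
            name for name in EXPECTED_FILES
            if not os.path.isfile(os.path.join(folder, name))
        ]
    return backups


# Usage (assumed default location on macOS):
# root = os.path.expanduser("~/Library/Application Support/MobileSync/Backup")
# print(find_backups(root))
```

An empty list for a folder means all four files are present.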

The Manifest.plist file is a binary Property List file that contains information about the backup. It contains:

  • Backup keybag: The Backup keybag contains a set of data protection class keys that are different from the keys in the System keybag, and backed-up data is re-encrypted with the new class keys. Keys in the Backup keybag facilitate the secure storage of backups. We will learn about protection classes later
  • Date: This is the timestamp of when the backup was created or last updated
  • ManifestKey: This is the key used to encrypt Manifest.db, wrapped with a protection class key (the source cites class 4; in my backup it turned out to be class 3)
  • WasPasscodeSet: This identifies whether a passcode was set on the device when it was last synced
  • And much more…

Source: O’Reilly + Richinfante

The Manifest.db file, on the other hand, contains all the juicy info about the files in the backup and their paths. The only problem is that the Manifest.db file is encrypted and we need to use the information from the Manifest.plist file to decrypt it. If the backup was not encrypted, we could probably have gotten away without using the Manifest.plist file at all.

We can verify that the db file is encrypted by opening it in any SQL db viewer. I used “DB Browser for SQLite” and it showed me this screen:

SQLCipher Encryption

This clearly shows that the db is encrypted. Later we will see that not only is the DB encrypted, but every file is also encrypted with its own random per-file encryption key.
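You can also check this without a GUI. A plain SQLite file starts with the 16-byte header string "SQLite format 3" followed by a NUL byte, so anything else at the start of Manifest.db suggests it is encrypted (or not a database at all). A minimal sketch; looks_like_sqlite is a name I made up:

```python
# Every unencrypted SQLite database file begins with this 16-byte header.
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_sqlite(path):
    """Return True if the file starts with the standard SQLite header."""
    with open(path, "rb") as f:
        return f.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC
```

Running it on the encrypted Manifest.db should return False, and on the decrypted copy produced later it should return True.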

Decrypting the Manifest.db file

The basic decryption process is as follows:

  1. Decode the keybag stored in the BackupKeyBag entry of Manifest.plist. A high-level overview of this structure is given in the iOS Security Whitepaper. The iPhone Wiki describes the binary format: a 4-byte string type field, a 4-byte big-endian length field, and then the value itself. The important values are the PBKDF2 ITERations and SALT, the double protection salt DPSL and iteration count DPIC, and then for each protection CLS, the WPKY wrapped key.

  2. Using the backup password, derive a 32-byte key using the correct PBKDF2 salt and number of iterations. First, use a SHA256 round with DPSL and DPIC, then a SHA1 round with ITER and SALT. Unwrap each wrapped key according to RFC 3394.

  3. Decrypt the manifest database by pulling the 4-byte protection class and longer key from the ManifestKey in Manifest.plist, and unwrapping it. You now have a SQLite database with all file metadata.

  4. For each file of interest, get the class-encrypted per-file encryption key and protection class code by looking in the Files.file database column for a binary plist containing EncryptionKey and ProtectionClass entries. Strip the initial four-byte length tag from EncryptionKey before using it. Then, derive the final decryption key by unwrapping it with the class key that was unwrapped with the backup password. Finally, decrypt the file using AES in CBC mode with a zero IV.

Source: StackOverflow
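To make the two-round key derivation from the steps above concrete, here is a minimal sketch using only the standard library. The function name and parameter names are my own; the four salt and iteration values are assumed to have been read from the DPSL, DPIC, SALT and ITER blocks of the keybag, and the passphrase is assumed to be passed as bytes.

```python
import hashlib


def derive_backup_key(passphrase, dpsl, dpic, salt, iterations):
    """Derive the 32-byte key used to unwrap the class keys in the keybag.

    passphrase, dpsl and salt are bytes; dpic and iterations are ints.
    """
    # Round 1: PBKDF2-HMAC-SHA256 with the double protection salt and count.
    round1 = hashlib.pbkdf2_hmac("sha256", passphrase, dpsl, dpic, dklen=32)
    # Round 2: PBKDF2-HMAC-SHA1 over the round-1 output with SALT and ITER.
    return hashlib.pbkdf2_hmac("sha1", round1, salt, iterations, dklen=32)
```

The resulting key is what gets fed into the RFC 3394 unwrap of each WPKY value. The iteration counts in a real keybag are large, so expect this to take a noticeable amount of time.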

If protection classes and double protection don’t make much sense, I would highly recommend reading the iOS Security Whitepaper from page 12 onwards. It provides details about all of this and why iOS uses these protection classes.

If you don’t know what a Keybag is, Apple has decent documentation:

A data structure used to store a collection of class keys. Each type (user, device, system, backup, escrow, or iCloud Backup) has the same format.

A header containing: Version (set to four in iOS 12 or later), Type (system, backup, escrow, or iCloud Backup), Keybag UUID, an HMAC if the keybag is signed, and the method used for wrapping the class keys (tangling with the UID or PBKDF2), along with the salt and iteration count.

A list of class keys: Key UUID, Class (which file or Keychain Data Protection class), wrapping type (UID-derived key only; UID-derived key and passcode-derived key), wrapped class key, and a public key for asymmetric classes.

We can read the Manifest.plist file in Python using the biplist module. You can install it using pip:

pip install biplist

And then use it like this:

from biplist import readPlist
import os

backup_directory = os.path.expanduser("~/Library/Application Support/MobileSync/Backup/<unique_id>")
plist_path = os.path.join(backup_directory, "Manifest.plist")
# Read the plist from inside the backup folder, not from the current directory
plist = readPlist(plist_path)

Note: Don’t forget to replace <unique_id> with the name of your particular device backup folder.
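If you would rather not install a third-party package, Python’s built-in plistlib module can also read binary plists. This is an alternative to biplist, not what I used in the rest of this article, and load_manifest_plist is a hypothetical helper name:

```python
import os
import plistlib


def load_manifest_plist(backup_directory):
    """Read Manifest.plist from a backup folder using only the standard library."""
    plist_path = os.path.join(backup_directory, "Manifest.plist")
    with open(plist_path, "rb") as f:
        # plistlib detects the binary format on its own
        return plistlib.load(f)
```

Binary values such as ManifestKey come back as regular bytes objects, so the slicing shown later works the same way.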

This is what the plist contents look like:

Manifest.plist

From this dict, we need the BackupKeyBag and ManifestKey. They will help us decrypt the Manifest.db file. The BackupKeyBag is a binary string with the following format:

  • 4-byte block identifier
  • 4-byte block length (most significant byte first); a length of 4 means a total block length of 0xC bytes.
  • data

The first block is “VERS” with a version number of 3. There are various block types: VERS, TYPE, UUID, HMCK, WRAP, SALT, ITER, UUID, CLAS, WRAP, KTYP, WPKY, etc.

Source: iPhone Wiki
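The tag/length/value layout described above is simple enough to walk by hand. Here is a minimal sketch of such a parser; parse_keybag_blocks is my own name, and this only splits the blob into blocks without interpreting them the way a full Keybag class does.

```python
import struct


def parse_keybag_blocks(blob):
    """Split a keybag blob into a list of (tag, value) pairs."""
    blocks = []
    offset = 0
    while offset + 8 <= len(blob):
        tag = blob[offset:offset + 4]  # 4-byte block identifier
        # 4-byte block length, most significant byte first (big-endian)
        (length,) = struct.unpack(">L", blob[offset + 4:offset + 8])
        value = blob[offset + 8:offset + 8 + length]
        if len(value) != length:
            raise ValueError("truncated block")
        blocks.append((tag, value))
        offset += 8 + length
    return blocks
```

Four-byte values such as VERS or ITER can then be decoded as big-endian integers with struct.unpack(">L", value)[0]. A list is used rather than a dict because tags like UUID and WRAP repeat once per class key.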

Decrypting the keybag

There are quite a few resources available online that show you how to decrypt the keybag. It uses PBKDF2 for key generation and AES for encryption. You can check out this StackOverflow answer for working Python code to decrypt the keybag. I will be making use of the code from that answer.

There are a bunch of different protection classes. The one used for the manifest database is class 3. We can find this by reading the first 4 bytes of the ManifestKey value in our Manifest.plist file:

import struct
manifest_class = struct.unpack('<l', plist['ManifestKey'][:4])[0]
# Output: 3

I encrypted my iOS backup. This is helpful because Apple doesn’t back up sensitive data unless the backup is encrypted. Sensitive data includes stuff like WiFi passwords. Now we can use the code from StackOverflow, the backup encryption passphrase you set while creating the backup, and the rest of the ManifestKey from Manifest.plist to decrypt the Manifest.db file:

manifest_key = plist['ManifestKey'][4:]

kb = Keybag(plist['BackupKeyBag'])
kb.unlockWithPassphrase('passphrase')
key = kb.unwrapKeyForClass(manifest_class, manifest_key)

with open('Manifest.db', 'rb') as f:
    encrypted_db = f.read()

decrypted_data = AESdecryptCBC(encrypted_db, key)

with open('decrypted_manifest.db', 'wb') as f:
    f.write(decrypted_data)

As you can see above, if you don’t remember the passphrase you used while backing up your iOS device, you cannot decrypt anything. You need it to proceed with the rest of the decryption process.

Now if we open decrypted_manifest.db in a SQL viewer we can see the actual data:

decrypted manifest.plist

We can find all files associated with WhatsApp by doing a global search for WhatsApp. The chats are stored in a ChatStorage.sqlite file:

whatsapp-manifest-plist

We can get this record using Python:

import sqlite3

db_conn = sqlite3.connect('decrypted_manifest.db')
# The comparison below is case-sensitive, so the name must match exactly
relative_path = "ChatStorage.sqlite"
query = """
    SELECT fileID, file
    FROM Files
    WHERE relativePath = ?
    ORDER BY domain, relativePath
    LIMIT 1;
"""
cur = db_conn.cursor()
cur.execute(query, (relative_path,))
result = cur.fetchone()
file_id, file_bplist = result

One thing to note is that the fileID is a hash of the domain + file name, so it will probably be the same for you. It is generated like this:

import hashlib

domain = "AppDomainGroup-group.net.whatsapp.WhatsApp.shared"
relative_path = "ChatStorage.sqlite"
# SHA-1 of "<domain>-<relativePath>" gives the fileID used in the backup
file_id_hash = hashlib.sha1(f"{domain}-{relative_path}".encode()).hexdigest()

# file_id_hash = 7c7fba66680ef796b916b067077cc246adacf01d

The record in the db contains the binary plist associated with the ChatStorage.sqlite file. We got hold of it by running the query above. We can take a look inside using the readPlistFromString method of the biplist module and extract the required information:

from biplist import readPlistFromString
file_plist = readPlistFromString(file_bplist)

# print(file_plist)

# {'$archiver': 'NSKeyedArchiver',
#  '$objects': ['$null',
#               {'$class': Uid(5),
#                'Birth': 1617036196,
#                'EncryptionKey': Uid(3),
#                'Flags': 0,
#                'GroupID': 501,
#                'InodeNumber': 45839007,
#                'LastModified': 1650483880,
#                'LastStatusChange': 1650481761,
#                'Mode': 33188,
#                'ProtectionClass': 3,
#                'RelativePath': Uid(2),
#                'Size': 22056960,
#                'UserID': 501},
#               'ChatStorage.sqlite',
#               {'$class': Uid(4),
#                'NS.data': b'\x03\x00\x00\x00tE1\xd1Hn"e\x06\xf7\x1cl'
#                           b'\x82\xed\x05\xe7\x1d\x1c\xd6\x97\x0e\xe9\x8b"'
#                           b'\xfa\x16\x93\x9c3\x18\xben\x14\x1eR;f\x98\xe3v'},
#               {'$classes': ['NSMutableData', 'NSData', 'NSObject'],
#                '$classname': 'NSMutableData'},
#               {'$classes': ['MBFile', 'NSObject'], '$classname': 'MBFile'}],
#  '$top': {'root': Uid(1)},
#  '$version': 100000}
file_data = file_plist['$objects'][file_plist['$top']['root'].integer]
protection_class = file_data['ProtectionClass']

encryption_key = file_plist['$objects'][file_data['EncryptionKey'].integer]['NS.data'][4:]

# file_data
# {'$class': Uid(5),
#  'Birth': 1617036196,
#  'EncryptionKey': Uid(3),
#  'Flags': 0,
#  'GroupID': 501,
#  'InodeNumber': 45839007,
#  'LastModified': 1650483880,
#  'LastStatusChange': 1650481761,
#  'Mode': 33188,
#  'ProtectionClass': 3,
#  'RelativePath': Uid(2),
#  'Size': 22056960,
#  'UserID': 501}

# protection_class
# 3

# encryption_key
# ---truncated---

Now we need to use the keybag object (kb) to unwrap the encryption key from above for the required protection class (3):

file_decryption_key = kb.unwrapKeyForClass(protection_class, encryption_key)
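The UID lookups above can be wrapped into one small helper. The sketch below swaps biplist for the standard library’s plistlib (which exposes UIDs as plistlib.UID objects with a .data attribute, Python 3.8+), so treat it as an alternative rather than the code I actually ran. The name extract_file_key is hypothetical.

```python
import plistlib


def extract_file_key(file_bplist):
    """Return (protection_class, wrapped_key) from a Files.file binary plist."""
    archive = plistlib.loads(file_bplist)
    objects = archive["$objects"]
    # '$top' -> 'root' is a UID pointing at the MBFile entry in '$objects'
    file_data = objects[archive["$top"]["root"].data]
    # 'EncryptionKey' is another UID pointing at the NSMutableData entry
    key_blob = objects[file_data["EncryptionKey"].data]["NS.data"]
    # Drop the leading 4 bytes so only the wrapped key remains
    return file_data["ProtectionClass"], key_blob[4:]
```

The two returned values are exactly what kb.unwrapKeyForClass expects as its arguments.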

Decrypting ChatStorage.sqlite

Sweet! All that’s left is to decrypt the actual chat db. But where is it stored? Apple stores files in the backup folder in a predictable format. It puts them in a subdirectory whose name is the first two characters of the fileID (e.g. 7c/7c7fba66680ef796b916b067077cc246adacf01d). We can get the full path to the chat db file like this:

filename_in_backup = os.path.join(backup_directory, file_id[:2], file_id)

This will allow us to open the encrypted file and decrypt it using the file_decryption_key we extracted above:

with open(filename_in_backup, 'rb') as encrypted_file:
    encrypted_data = encrypted_file.read()

decrypted_data = AESdecryptCBC(encrypted_data, file_decryption_key)

Note: This AESdecryptCBC function is part of the code we got from StackOverflow

Sometimes the encryption adds padding at the end of the data to make it a multiple of the block size. So we need to make sure we remove any padding from the end of the data as well:

def removePadding(data, blocksize=16):
    n = int(data[-1])  # RFC 1423: last byte contains number of padding bytes.
    # n == 0 is rejected too, since data[:-0] would return an empty result
    if n < 1 or n > blocksize or n > len(data):
        raise Exception('Invalid CBC padding')
    return data[:-n]

decrypted_data = removePadding(decrypted_data)
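The check above only looks at the last byte. If you want something stricter, the sketch below also verifies that every padding byte carries the same count. It assumes the padding follows the usual scheme where n bytes of value n are appended, so it may reject data that the lenient version accepts. strip_pkcs7_padding is my own name, not part of the StackOverflow code.

```python
def strip_pkcs7_padding(data, blocksize=16):
    """Remove block padding, validating every padding byte."""
    if not data or len(data) % blocksize:
        raise ValueError("data length must be a non-zero multiple of the block size")
    n = data[-1]  # the last byte holds the number of padding bytes
    if n < 1 or n > blocksize or data[-n:] != bytes([n]) * n:
        raise ValueError("invalid CBC padding")
    return data[:-n]
```

A failure here is a useful early signal that the wrong key was used, since decrypting with a bad key almost never produces well-formed padding.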

We can save this decrypted data in a new SQLite file:

with open('decrypted_ChatStorage.sqlite', 'wb') as f:
    f.write(decrypted_data)

If we now open this new file in a SQLite browser, we can see all the tables:

WhatsApp Messages Table

The chats are stored in the ZWAMESSAGE table:

WhatsApp Messages
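You can also read the messages straight from Python. The sketch below makes a few assumptions that I have not verified beyond my own backup: that the table has ZMESSAGEDATE, ZISFROMME and ZTEXT columns, and that ZMESSAGEDATE counts seconds from 2001-01-01 UTC (the usual Core Data convention). Check the column names in your own file before relying on it. read_messages is a hypothetical helper.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumption: timestamps are seconds since the Core Data epoch.
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def read_messages(conn, limit=10):
    """Return the oldest text messages as (sent_at, from_me, text) tuples."""
    rows = conn.execute(
        "SELECT ZMESSAGEDATE, ZISFROMME, ZTEXT FROM ZWAMESSAGE "
        "WHERE ZTEXT IS NOT NULL AND ZMESSAGEDATE IS NOT NULL "
        "ORDER BY ZMESSAGEDATE LIMIT ?",
        (limit,),
    )
    return [
        (APPLE_EPOCH + timedelta(seconds=ts), bool(from_me), text)
        for ts, from_me, text in rows
    ]


# Usage:
# conn = sqlite3.connect("decrypted_ChatStorage.sqlite")
# for sent_at, from_me, text in read_messages(conn):
#     print(sent_at, "me" if from_me else "them", text)
```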

If you are looking for all the media files that were sent with messages, you will have to go back to the decrypted Manifest.db file and filter for media files stored under Message/Media:

media manifest.plist

You can use the following SQL query to get all of these media files:

"""
SELECT fileID,
       relativePath,
       flags,
       file
FROM Files
WHERE relativePath
    LIKE 'Message/Media/%'
"""
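Combining this query with the fileID path layout from earlier gives the on-disk location of each media entry. A minimal sketch, with media_file_paths being a name I made up:

```python
import os
import sqlite3


def media_file_paths(conn, backup_directory):
    """Return (relativePath, path_in_backup) for every Message/Media entry."""
    query = """
        SELECT fileID, relativePath
        FROM Files
        WHERE relativePath LIKE 'Message/Media/%'
        ORDER BY relativePath
    """
    return [
        # Files live under a folder named after the first two fileID characters
        (relative_path, os.path.join(backup_directory, file_id[:2], file_id))
        for file_id, relative_path in conn.execute(query)
    ]
```

Keep in mind that each of these files is still encrypted. You need to pull the per-file key out of the file column and unwrap it, the same way we did for ChatStorage.sqlite. Some rows may not have a matching file on disk, so it is worth checking os.path.isfile before opening a path.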

Now here comes the best part. You don’t have to do any of this yourself. There is already a Python program out there that can parse through your iOS backup, extract all the media files, chats, and contact list, and convert them into HTML format. This way you can read your chats without porting the backup into a WhatsApp client.

Whatsapp-Chat-Exporter works with iOS and Android ✨

I used this tool to eventually convert all of my WhatsApp messages into HTML format for easy browsing on my laptop.

Useful Resources

I took some help from a bunch of different resources while writing this article. You can go through them to get a deeper understanding of some of the stuff mentioned in this article:

Conclusion

I hope you learned a thing or two from this article. I had a fun time diving into the weeds of iOS backups. I had no idea how Apple was storing the backup and how easy or hard it was going to be to get the particular file I wanted out of it. Suffice to say it wasn’t too hard and it taught me a few fun things in the process.
