I got launchd and recoveryd to start on an emulated iPhone running iOS 12 beta 4’s kernel using a modified QEMU. Here’s what I learned, and how you can try this yourself.
Introduction
This is Part 2 of a series on the iOS boot process. Part 1 is here. Sign up with your email to be the first to read new posts.
First, let me repeat: this is completely useless unless you’re really interested in iOS internals. If you want to run iOS, you should ask @CorelliumHQ instead, or just buy an iPhone.
I’ve been interested in how iOS starts, so I’ve been trying to boot the iOS kernel in QEMU.
I was inspired by @cmwdotme’s Corellium, a service which can boot any iOS in a virtual machine. Since I don’t have 9 years to build a perfect simulation of an iPhone, I decided to go for a less lofty goal: getting enough of iOS emulated until launchd, the first program to run when iOS boots, is able to start.
Since last week’s post, I got the iOS 12 beta 4 kernel to fully boot in QEMU, and even got it to run launchd and start recoveryd from the restore ramdisk. Here’s the output from the virtual serial port:
If you would like to examine iOS’s boot process yourself, here’s how you can try it out.
Building QEMU
The emulation uses a patched copy of QEMU, which must be compiled from source.
Install dependencies
To compile QEMU, you first need to install some libraries.
macOS:
According to the QEMU wiki and the Homebrew recipe, you need to install Xcode and Homebrew, then run
brew install pkg-config libtool jpeg glib pixman
to install the required libraries to compile QEMU.
Ubuntu 18.04:
According to the QEMU wiki, run
sudo apt install libglib2.0-dev libfdt-dev libpixman-1-dev zlib1g-dev libsdl1.2-dev
to install the required libraries to compile QEMU.
Windows:
QEMU can be built on Windows, but the official instructions don’t seem to work for this modified QEMU. Please build on macOS or Linux instead. You can set up a virtual machine running Ubuntu 18.04 with VirtualBox or VMware Player.
Download and build source
Open a terminal, and run
Preparing iOS files for QEMU
Once QEMU is compiled, you need to obtain the required iOS kernelcache, device tree, and ramdisk.
If you don’t want to extract these files yourself, I packaged all the files you need from iOS 12 beta 4. You can download this archive if you sign up for my mailing list.
If you want to extract your own files directly from an iOS update, here’s how:
1. Download the required files:
- Download my XNUQEMUScripts repository:
Download the iOS 12 beta 4 for iPhone X.
To decompress the kernel, download newosxbook’s Joker tool.
2. Extract the kernel using Joker:
Replace joker.universal with joker.ELF64 if you are using Linux.
3. Extract the ramdisk:
4. Modify the devicetree.dtb file:
Installing a debugger
You will also need lldb or gdb for arm64 installed.
macOS
The version of lldb included in Xcode 9.3 should work. (Later versions should also work.) You don’t need to install anything in this step.
Ubuntu 18.04
I can’t find an LLDB compatible with ARM64: neither the LLDB from the Ubuntu repository nor the version from LLVM’s own repos support ARM64. (Someone please build one!)
Instead, you can use GDB on Linux.
Two versions of GDB can be used: the version from devkitA64, or the Linaro GDB (recommended).
Enter your xnuqemu directory (from the downloaded package or from the clone of the XNUQEMUScripts repo)
Run
to download the Linaro GDB.
Running QEMU
Place your qemu directory into the same directory as the scripts, kernel, devicetree, and ramdisk.
You should have these files:
Run ./runqemu.sh to start QEMU.
In a different terminal, run ./lldbit.sh to start lldb, or if you’re using Linux, ./gdbit.sh to start gdb.
Type c into lldb or gdb to start execution.
In the terminal running QEMU, you should see boot messages. Congratulations, you’ve just run a tiny bit of iOS with a virtual iPhone! Or as UnthreadedJB would say, “#we r of #fakr!”
What works
- Booting XNU all the way to running userspace programs
- Console output from virtual serial port
What doesn’t work
- Wi-Fi
- Bluetooth
- USB
- Screen
- Internal storage
- Everything except the serial port
Seriously, though, this only runs a tiny bit of iOS, and is nowhere close to iOS emulation. To borrow a simile from the creator of Corellium, if Corellium is a DeLorean time machine, then this is half a wheel at most.
This experiment only finished the easy part of booting iOS, as it doesn’t emulate an iPhone at all, relying on only the parts common to all ARM devices. No drivers are loaded whatsoever, so there’s no emulation of the screen, the USB, the internal storage… You name it: it doesn’t work.
For full iOS emulation, the next step would be reverse engineering the iPhone’s SoC to find out how its peripherals work. Unfortunately, that’s a 9-year project, as shown by the development history of Corellium. I can’t do that on my own - that’s why I wrote this tutorial!
It’s my hope that this work inspires others to look into proper iOS emulation - from what I’ve seen, it’ll be a great learning experience.
How I did this
Last week, I started modifying QEMU to load an iOS kernel and device tree: the previous writeup is here. Here’s how I got from crashing when loading kernel modules to fully booting the kernel.
Tweaking CPU emulation, part 3: Interrupting cow
When we left off, the kernel crashed with a data abort when it tried to bzero a read-only region of memory. Why?
To confirm that it’s indeed writing to read-only memory, I implemented a command to dump out the kernel memory mappings, and enabled QEMU’s verbose MMU logging to detect changes to the memory map.
I tracked down the crashing code to OSKext::updateLoadedKextSummaries. After every kext load, this code resets the kext summaries region to writable with vm_map_protect, writes information for the new kext, then sets the region back to read-only. The logs show that the call to protect the region modifies the memory mappings, but the call to reset it to read-write doesn’t do anything. Why isn’t it setting the page to writable?
According to comments in vm_map_protect, it turns out that readonly->readwrite calls don’t actually change the protection immediately, but only set it on demand when a program tries - and fails - to write to the page. This is to implement copy on write.
So, it seems the data abort exception is supposed to happen, but the panic is not.
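The deferred upgrade can be illustrated with a toy model. This is only a sketch of the idea described above, not XNU's code: the Page class and the protect, handle_fault, and write functions are hypothetical names invented for the illustration.

```python
from dataclasses import dataclass

READ, WRITE = 1, 2


@dataclass
class Page:
    allowed: int   # what the map entry permits (set by protect)
    hardware: int  # what the page table currently enforces


class ProtectionFault(Exception):
    """Raised when a write is not permitted even after the fault handler runs."""


def protect(page, new_prot):
    """Toy protect call: downgrades apply at once, upgrades are deferred."""
    page.allowed = new_prot
    # Removed permissions take effect immediately...
    page.hardware &= new_prot
    # ...but newly granted bits are not added to `hardware` here.


def handle_fault(page, access):
    """Toy fault handler: grant the access only if the map entry allows it."""
    if page.allowed & access:
        page.hardware |= access
        return True
    return False


def write(page):
    """Attempt a write; the first write after an upgrade takes a fault."""
    if not (page.hardware & WRITE):
        if not handle_fault(page, WRITE):
            raise ProtectionFault("write to a page that really is read-only")
    return "written"
```

In this model, the fault after a read-only to read-write change is expected and harmless; a panic only makes sense when the fault handler itself refuses the access.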
In the data abort exception, the page should be set to writable in arm_fast_fault. The code in open-source XNU can only return KERN_FAILURE or KERN_SUCCESS, but with a breakpoint, I saw it was returning KERN_PROTECTION_FAILURE.
I checked the disassembly: yes, there’s extra code (0xFFFFFFF0071F953C in iOS 12 beta 4) returning KERN_PROTECTION_FAILURE if the page address doesn’t match one of the new KTRR registers added on the A11 processor.
I had been ignoring all writes to KTRR registers, so this code couldn’t read back the value the kernel stored in the register at startup, and believed that all addresses were invalid. Thus, instead of setting the page to writable, the kernel panicked.
I fixed this by adding these registers to QEMU’s virtual CPU, allowing the kernel to read and write them.
After this change, a few more kexts started up, but the kernel then hangs… like it’s waiting for something.
Connecting the timer interrupt
My hunch for why the kernel hangs: one of the kexts tries to sleep for some time during initialization, but never wakes up because there are no timer interrupts, as shown by QEMU not logging any exceptions when it hangs.
On ARM, there are two ways for hardware to signal the CPU: IRQ, shared by many devices, or FIQ, dedicated to just one device.
QEMU’s virt machine hooks up the processor’s timer to IRQ, like most real ARM platforms. FIQ is usually reserved for debuggers.
Apple, however, hooks up the timer directly to the FIQ. With virt’s timer hooked up to the wrong signal, the kernel would wait forever for an interrupt that would never come.
All I had to do to get the timer working was to hook it up to FIQ. This gets me… a nice panic in the Image4 parser.
Getting the Image4 parser module working
What does this mean? What’s error 0x60?
I found the panic string, and looked for where the error message is generated.
It turns out that the Image4 parser queries the device tree for various nodes in “/chosen” or “/default”; if the value doesn’t exist, it returns error 0x60. If the value is the wrong size, it returns 0x54.
iOS’s device tree is missing two properties: chip-epoch and security-domain, which causes the module to panic with the 0x60 error.
Oddly, the device tree doesn’t reserve extra space for these properties. I had to delete two existing properties to make space for them.
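The lookup behaviour described above can be sketched as follows. This is a guess at the shape of the logic, not the Image4 parser's code: the read_property helper, the dictionary layout of the tree, the order in which the two nodes are searched, and the 4-byte size used in the usage example are all assumptions. Only the two error codes come from the analysis above.

```python
ERR_MISSING = 0x60   # property not found (from the panic analysis above)
ERR_BAD_SIZE = 0x54  # property found, but with the wrong size


def read_property(tree, name, expected_size, nodes=("/chosen", "/default")):
    """Look up `name` in the given nodes and return (error_code, value)."""
    for node in nodes:
        props = tree.get(node, {})
        if name in props:
            value = props[name]
            if len(value) != expected_size:
                return ERR_BAD_SIZE, None
            return 0, value
    return ERR_MISSING, None
```

For example, with tree = {"/chosen": {"chip-epoch": bytes(4)}}, asking for security-domain yields 0x60, and asking for chip-epoch with an expected size of 8 yields 0x54.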
With the modified device tree, the Image4 module initializes, but now I have a panic from a data abort in rorgn_lockdown.
Failed attempt to get device drivers to not crash
Of course the KTRR driver crashes when it tries to access the memory controller: there isn’t one! QEMU’s virt machine doesn’t have anything mapped at that address.
Since I don’t have an emulation of the memory controller, I just added a block of empty memory to avoid the crash.
This strategy didn’t work for the next crash, though, from the AppleInterruptController driver. That driver reads and validates values from the device, so just placing a blank block of memory causes the driver to panic.
Something more drastic is needed if I don’t want to spend 9 years reverse engineering each driver.
Driverless like we’re Waymo
To boot XNU, I don’t really need all those drivers, do I? Who needs interrupts or the screen or power management or storage, anyways? All XNU needs to boot into userspace is a serial port and a timer.
I disabled every other driver in the kernel. Drivers are loaded if their IONameMatch property corresponds to a device’s “compatible”, “name”, or “device_type” fields. To disable all the drivers, I erased every “compatible” property in the device tree, along with a few “name” and “device_type” properties.
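A rough sketch of that matching rule, and of why erasing properties disables a driver. The function names and the example device below are hypothetical, and real IOKit matching is more involved than a simple membership test.

```python
MATCH_FIELDS = ("compatible", "name", "device_type")


def driver_matches(io_name_match, device):
    """Return True if any matchable field of the device is listed by the driver."""
    for field in MATCH_FIELDS:
        value = device.get(field)
        if value is None:
            continue
        # "compatible" may hold several strings; treat every field as a list.
        values = value if isinstance(value, (list, tuple)) else [value]
        if any(v in io_name_match for v in values):
            return True
    return False


def strip_properties(device, fields=("compatible",)):
    """Return a copy of the device node without the given properties."""
    return {key: value for key, value in device.items() if key not in fields}
```

With a made-up node such as {"name": "timer", "compatible": ["example,timer"]}, a driver whose match list is ["example,timer"] stops matching once "compatible" is stripped, while a driver matching on the name "timer" would still load. That is why a few "name" and "device_type" properties had to go too.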
Now, with no drivers, XNU seems to hang, but after I patiently waited for a minute…
It’s trying to mount the root filesystem!
Loading a RAMDisk
If it’s looking for a root filesystem, let’s give it one. I don’t have any drivers for storage, but I can mount an iOS Recovery RAMDisk, which requires no drivers. All I had to do was:
- load the ramdisk at the end of the kernel, just before the device tree blob
- put its address and size in the device tree so XNU can find it
- set the boot argument to rd=md0 to boot from the ramdisk
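The steps above can be sketched as a small layout calculation. The plan_layout helper, the 16 KiB alignment, and the addresses in the usage example are assumptions for illustration; the modified QEMU may place things differently.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def plan_layout(kernel_base, kernel_size, ramdisk_size, alignment=0x4000):
    """Place the ramdisk after the kernel and the device tree after the ramdisk."""
    ramdisk_addr = align_up(kernel_base + kernel_size, alignment)
    dtb_addr = align_up(ramdisk_addr + ramdisk_size, alignment)
    return {
        "ramdisk_addr": ramdisk_addr,  # to be recorded in the device tree
        "ramdisk_size": ramdisk_size,  # to be recorded in the device tree
        "dtb_addr": dtb_addr,
        "bootargs": "rd=md0",          # tells XNU to use the ramdisk as root
    }
```

For instance, plan_layout(0x40000000, 0x1234, 0x5000) puts the ramdisk at 0x40004000 and the device tree at 0x4000C000.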
The kernel mounts the root filesystem! … but then hangs again.
Using LLDB to patch out hanging functions
By putting breakpoints all over bsd_init, I found that the kernel was hanging in IOBSDSecureRoot, when it tries to call the platform function. The platform function looks for a device, but since I removed all the device drivers, it waits forever, in vain.
To fix this, I just skipped the problematic call. I used an LLDB breakpoint to jump over the call and simulate a true return instead.
And, after three weeks, the virtual serial port finally printed out:
“Houston, the kernel has booted.”
What I learned
- quirks of iOS memory management
- how iOS handles timer interrupts
- how iOS loads ramdisks
- building QEMU on different platforms
- modifying QEMU to add new CPU configuration registers
- differences between GDB and LLDB’s command syntax
- how to get people to subscribe to my mailing list. (muhahaha, one last signup link.)
Thanks
Thanks to everyone who shared or commented on my last article. To those who tried building and running it - sorry about taking so long to write up instructions!
Thanks to @matteyeux, @h3adsh0tzz, @_th0ex, and @enzolovesbacon for testing the build instructions.
Thanks to @winocm, whose darwin-on-arm project originally inspired me to learn about the XNU kernel.
QEMU supports many disk image formats, including growable disk images (their size increases as non-empty sectors are written), compressed and encrypted disk images.
Quick start for disk image creation
You can create a disk image with the command:
qemu-img create myimage.img mysize
where myimage.img is the disk image filename and mysize is its size in kilobytes. You can add an M suffix to give the size in megabytes and a G suffix for gigabytes.
See the qemu-img invocation documentation for more information.
Snapshot mode
If you use the option -snapshot, all disk images are considered as read only. When sectors are written, they are written to a temporary file created in /tmp. You can however force the write back to the raw disk images by using the commit monitor command (or C-a s in the serial console).
VM snapshots
VM snapshots are snapshots of the complete virtual machine including CPU state, RAM, device state and the content of all the writable disks. In order to use VM snapshots, you must have at least one non removable and writable block device using the qcow2 disk image format. Normally this device is the first virtual hard drive.
Use the monitor command savevm to create a new VM snapshot or replace an existing one. A human readable name can be assigned to each snapshot in addition to its numerical ID.
Use loadvm to restore a VM snapshot and delvm to remove a VM snapshot. info snapshots lists the available snapshots with their associated information:
A VM snapshot is made of a VM state info (its size is shown in info snapshots) and a snapshot of every writable disk image. The VM state info is stored in the first qcow2 non removable and writable block device. The disk image snapshots are stored in every disk image. The size of a snapshot in a disk image is difficult to evaluate and is not shown by info snapshots because the associated disk sectors are shared among all the snapshots to save disk space (otherwise each snapshot would need a full copy of all the disk images).
When using the (unrelated) -snapshot option (Snapshot mode), you can always make VM snapshots, but they are deleted as soon as you exit QEMU.
VM snapshots currently have the following known limitations:
- They cannot cope with removable devices if they are removed or inserted after a snapshot is done.
- A few device drivers still have incomplete snapshot support so their state is not saved or restored properly (in particular USB).
Disk image file formats
QEMU supports many image file formats that can be used with VMs as well as with any of the tools (like qemu-img). This includes the preferred formats raw and qcow2 as well as formats that are supported for compatibility with older QEMU versions or other hypervisors.
Depending on the image format, different options can be passed to qemu-img create and qemu-img convert using the -o option. This section describes each format and the options that are supported for it.
raw
Raw disk image format. This format has the advantage of being simple and easily exportable to all other emulators. If your file system supports holes (for example in ext2 or ext3 on Linux or NTFS on Windows), then only the written sectors will reserve space. Use qemu-img info to know the real size used by the image or ls -ls on Unix/Linux.
Supported options:
preallocation
Preallocation mode (allowed values: off, falloc, full). falloc mode preallocates space for image by calling posix_fallocate(). full mode preallocates space for image by writing data to underlying storage. This data may or may not be zero, depending on the storage location.
qcow2
QEMU image format, the most versatile format. Use it to have smaller images (useful if your filesystem does not support holes, for example on Windows), zlib based compression and support of multiple VM snapshots.
Supported options:
compat
Determines the qcow2 version to use. compat=0.10 uses the traditional image format that can be read by any QEMU since 0.10. compat=1.1 enables image format extensions that only QEMU 1.1 and newer understand (this is the default). Amongst others, this includes zero clusters, which allow efficient copy-on-read for sparse images.
backing_file
File name of a base image (see create subcommand)
backing_fmt
Image format of the base image
encryption
This option is deprecated and equivalent to encrypt.format=aes
encrypt.format
If this is set to luks, it requests that the qcow2 payload (not qcow2 header) be encrypted using the LUKS format. The passphrase to use to unlock the LUKS key slot is given by the encrypt.key-secret parameter. LUKS encryption parameters can be tuned with the other encrypt.* parameters.
If this is set to aes, the image is encrypted with 128-bit AES-CBC. The encryption key is given by the encrypt.key-secret parameter. This encryption format is considered to be flawed by modern cryptography standards, suffering from a number of design problems:
- The AES-CBC cipher is used with predictable initialization vectors based on the sector number. This makes it vulnerable to chosen plaintext attacks which can reveal the existence of encrypted data.
- The user passphrase is directly used as the encryption key. A poorly chosen or short passphrase will compromise the security of the encryption.
- In the event of the passphrase being compromised there is no way to change the passphrase to protect data in any qcow images. The files must be cloned, using a different encryption passphrase in the new file. The original file must then be securely erased using a program like shred, though even this is ineffective with many modern storage technologies.
The use of this is no longer supported in system emulators. Support only remains in the command line utilities, for the purposes of data liberation and interoperability with old versions of QEMU. The luks format should be used instead.
encrypt.key-secret
Provides the ID of a secret object that contains the passphrase (encrypt.format=luks) or encryption key (encrypt.format=aes).
encrypt.cipher-alg
Name of the cipher algorithm and key length. Currently defaults to aes-256. Only used when encrypt.format=luks.
encrypt.cipher-mode
Name of the encryption mode to use. Currently defaults to xts. Only used when encrypt.format=luks.
encrypt.ivgen-alg
Name of the initialization vector generator algorithm. Currently defaults to plain64. Only used when encrypt.format=luks.
encrypt.ivgen-hash-alg
Name of the hash algorithm to use with the initialization vector generator (if required). Defaults to sha256. Only used when encrypt.format=luks.
encrypt.hash-alg
Name of the hash algorithm to use for the PBKDF algorithm. Defaults to sha256. Only used when encrypt.format=luks.
encrypt.iter-time
Amount of time, in milliseconds, to use for the PBKDF algorithm per key slot. Defaults to 2000. Only used when encrypt.format=luks.
cluster_size
Changes the qcow2 cluster size (must be between 512 and 2M). Smaller cluster sizes can improve the image file size whereas larger cluster sizes generally provide better performance.
preallocation
Preallocation mode (allowed values: off, metadata, falloc, full). An image with preallocated metadata is initially larger but can improve performance when the image needs to grow. falloc and full preallocations are like the same options of the raw format, but set up metadata also.
lazy_refcounts
If this option is set to on, reference count updates are postponed with the goal of avoiding metadata I/O and improving performance. This is particularly interesting with cache=writethrough which doesn’t batch metadata updates. The tradeoff is that after a host crash, the reference count tables must be rebuilt, i.e. on the next open an (automatic) qemu-img check -r all is required, which may take some time.
This option can only be enabled if compat=1.1 is specified.
nocow
If this option is set to on, it will turn off COW of the file. It’s only valid on btrfs, and has no effect on other file systems.
Btrfs has low performance when hosting a VM image file, even more so when the guest on the VM is also using btrfs as its file system. Turning off COW is a way to mitigate this bad performance. Generally there are two ways to turn off COW on btrfs:
- Disable it by mounting with nodatacow, then all newly created files will be NOCOW.
- For an empty file, add the NOCOW file attribute. That’s what this option does.
Note: this option is only valid for new or empty files. If there is an existing file which is COW and has data blocks already, it can’t be changed to NOCOW by setting nocow=on. One can issue lsattr filename to check if the NOCOW flag is set or not (capital ‘C’ is the NOCOW flag).
qed
Old QEMU image format with support for backing files and compact image files (when your filesystem or transport medium does not support holes).
When converting QED images to qcow2, you might want to consider using the lazy_refcounts=on option to get a more QED-like behaviour.
Supported options:
backing_file
File name of a base image (see create subcommand).
backing_fmt
Image file format of backing file (optional). Useful if the format cannot be autodetected because it has no header, like some vhd/vpc files.
cluster_size
Changes the cluster size (must be power-of-2 between 4K and 64K). Smaller cluster sizes can improve the image file size whereas larger cluster sizes generally provide better performance.
table_size
Changes the number of clusters per L1/L2 table (must be power-of-2 between 1 and 16). There is normally no need to change this value but this option can be used for performance benchmarking.
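The two QED constraints above are easy to check before invoking qemu-img. The helper below is a minimal sketch that encodes only the limits stated in this section; validate_qed_options is a hypothetical name, not part of QEMU.

```python
KIB = 1024


def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... and False for zero, negatives, and other values."""
    return n > 0 and (n & (n - 1)) == 0


def validate_qed_options(cluster_size, table_size):
    """Return a list of problems with the given QED options (empty if none)."""
    errors = []
    if not (is_power_of_two(cluster_size) and 4 * KIB <= cluster_size <= 64 * KIB):
        errors.append("cluster_size must be a power of 2 between 4K and 64K")
    if not (is_power_of_two(table_size) and 1 <= table_size <= 16):
        errors.append("table_size must be a power of 2 between 1 and 16")
    return errors
```

An empty list means both values satisfy the documented limits; QEMU itself remains the authority on what it accepts.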
qcow
Old QEMU image format with support for backing files, compact image files, encryption and compression.
Supported options:
backing_file
File name of a base image (see create subcommand)
encryption
This option is deprecated and equivalent to encrypt.format=aes
encrypt.format
If this is set to aes, the image is encrypted with 128-bit AES-CBC. The encryption key is given by the encrypt.key-secret parameter. This encryption format is considered to be flawed by modern cryptography standards, suffering from a number of design problems enumerated previously against the qcow2 image format.
The use of this is no longer supported in system emulators. Support only remains in the command line utilities, for the purposes of data liberation and interoperability with old versions of QEMU.
Users requiring native encryption should use the qcow2 format instead with encrypt.format=luks.
encrypt.key-secret
Provides the ID of a secret object that contains the encryption key (encrypt.format=aes).
luks
LUKS v1 encryption format, compatible with Linux dm-crypt/cryptsetup
Supported options:
key-secret
Provides the ID of a secret object that contains the passphrase.
cipher-alg
Name of the cipher algorithm and key length. Currently defaults to aes-256.
cipher-mode
Name of the encryption mode to use. Currently defaults to xts.
ivgen-alg
Name of the initialization vector generator algorithm. Currently defaults to plain64.
ivgen-hash-alg
Name of the hash algorithm to use with the initialization vector generator (if required). Defaults to sha256.
hash-alg
Name of the hash algorithm to use for the PBKDF algorithm. Defaults to sha256.
iter-time
Amount of time, in milliseconds, to use for the PBKDF algorithm per key slot. Defaults to 2000.
vdi
VirtualBox 1.1 compatible image format.
Supported options:
static
If this option is set to on, the image is created with metadata preallocation.
vmdk
VMware 3 and 4 compatible image format.
Supported options:
backing_file
File name of a base image (see create subcommand).
compat6
Create a VMDK version 6 image (instead of version 4)
hwversion
Specify vmdk virtual hardware version. Compat6 flag cannot be enabled if hwversion is specified.
subformat
Specifies which VMDK subformat to use. Valid options are monolithicSparse (default), monolithicFlat, twoGbMaxExtentSparse, twoGbMaxExtentFlat and streamOptimized.
vpc
VirtualPC compatible image format (VHD).
Supported options:
subformat
Specifies which VHD subformat to use. Valid options are dynamic (default) and fixed.
VHDX
Hyper-V compatible image format (VHDX).
Supported options:
subformat
Specifies which VHDX subformat to use. Valid options are dynamic (default) and fixed.
block_state_zero
Force use of payload blocks of type ‘ZERO’. Can be set to on (default) or off. When set to off, new blocks will be created as PAYLOAD_BLOCK_NOT_PRESENT, which means parsers are free to return arbitrary data for those blocks. Do not set to off when using qemu-img convert with subformat=dynamic.
block_size
Block size; min 1 MB, max 256 MB. 0 means auto-calculate based on image size.
log_size
Log size; min 1 MB.
Read-only formats
More disk image file formats are supported in a read-only mode.
bochs
Bochs images of growing type.
cloop
Linux Compressed Loop image, useful only to reuse directly compressed CD-ROM images present for example in the Knoppix CD-ROMs.
dmg
Apple disk image.
parallels
Parallels disk image format.
Using host drives
In addition to disk image files, QEMU can directly access host devices. We describe here the usage for QEMU version >= 0.8.3.
Linux
On Linux, you can directly use the host device filename instead of a disk image filename provided you have enough privileges to access it. For example, use /dev/cdrom to access the CDROM.
- CD: You can specify a CDROM device even if no CDROM is loaded. QEMU has specific code to detect CDROM insertion or removal. CDROM ejection by the guest OS is supported. Currently only data CDs are supported.
- Floppy: You can specify a floppy device even if no floppy is loaded. Floppy removal is currently not detected accurately (if you change floppy without doing floppy access while the floppy is not loaded, the guest OS will think that the same floppy is loaded). Use of the host’s floppy device is deprecated, and support for it will be removed in a future release.
- Hard disks: Hard disks can be used. Normally you must specify the whole disk (/dev/hdb instead of /dev/hdb1) so that the guest OS can see it as a partitioned disk. WARNING: unless you know what you are doing, it is better to only make READ-ONLY accesses to the hard disk, otherwise you may corrupt your host data (use the -snapshot command line option or modify the device permissions accordingly).
Windows
The preferred syntax is the drive letter (e.g. d:). The alternate syntax \\.\d: is supported. /dev/cdrom is supported as an alias to the first CDROM drive.
Currently there is no specific code to handle removable media, so it is better to use the change or eject monitor commands to change or eject media.
Hard disks can be used with the syntax \\.\PhysicalDriveN, where N is the drive number (0 is the first hard disk).
WARNING: unless you know what you are doing, it is better to only make READ-ONLY accesses to the hard disk, otherwise you may corrupt your host data (use the -snapshot command line option so that the modifications are written to a temporary file).
Mac OS X
/dev/cdrom is an alias to the first CDROM.
Currently there is no specific code to handle removable media, so it is better to use the change or eject monitor commands to change or eject media.
Virtual FAT disk images
QEMU can automatically create a virtual FAT disk image from a directory tree. In order to use it, just type:
Then you can access all the files in the /my_directory directory without having to copy them into a disk image or to export them via SAMBA or NFS. The default access is read-only.
Floppies can be emulated with the :floppy: option:
Read/write support is available for testing (beta stage) with the :rw: option:
What you should never do:
- use non-ASCII filenames
- use “-snapshot” together with “:rw:”
- expect it to work when loadvm’ing
- write to the FAT directory on the host system while accessing it with the guest system
NBD access
QEMU can directly access block devices exported using the Network Block Device protocol.
If the NBD server is located on the same host, you can use a unix socket instead of an inet socket:
In this case, the block device must be exported using qemu-nbd:
The use of qemu-nbd allows sharing of a disk between several guests:
and then you can use it with two guests:
If the nbd-server uses named exports (supported since NBD 2.9.18, or with QEMU’s own embedded NBD server), you must specify an export name in the URI:
The URI syntax for NBD is supported since QEMU 1.3. An alternative syntax is also available. Here are some examples of the older syntax:
iSCSI LUNs
iSCSI is a popular protocol used to access SCSI devices across a computer network.
There are two different ways iSCSI devices can be used by QEMU.
The first method is to mount the iSCSI LUN on the host, and make it appear as any other ordinary SCSI device on the host and then to access this device as a /dev/sd device from QEMU. How to do this differs between host OSes.
The second method involves using the iSCSI initiator that is built into QEMU. This provides a mechanism that works the same way regardless of which host OS you are running QEMU on. This section will describe this second method of using iSCSI together with QEMU.
In QEMU, iSCSI devices are described using special iSCSI URLs. URL syntax:
Username and password are optional and only used if your target is set up using CHAP authentication for access control. Alternatively the username and password can also be set via environment variables to have these not show up in the process list:
Various session related parameters can be set via special options, either in a configuration file provided via ‘-readconfig’ or directly on the command line.
If the initiator-name is not specified qemu will use a default name of ‘iqn.2008-11.org.linux-kvm[:<uuid>]’ where <uuid> is the UUID of the virtual machine. If the UUID is not specified qemu will use ‘iqn.2008-11.org.linux-kvm[:<name>]’ where <name> is the name of the virtual machine.
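The fallback order for the default initiator name can be written out as a short sketch. default_initiator_name is a hypothetical helper, not a QEMU function, and returning the bare prefix when neither a UUID nor a name is available is a reading of the square brackets in the text above rather than something it states outright.

```python
DEFAULT_IQN_PREFIX = "iqn.2008-11.org.linux-kvm"


def default_initiator_name(uuid=None, name=None):
    """Prefer the VM UUID, then the VM name, then the bare prefix."""
    suffix = uuid or name
    if suffix:
        return f"{DEFAULT_IQN_PREFIX}:{suffix}"
    return DEFAULT_IQN_PREFIX
```

For example, a VM with no UUID and the name "testvm" would get iqn.2008-11.org.linux-kvm:testvm under this reading.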
Setting a specific initiator name to use when logging in to the target:
Controlling which type of header digest to negotiate with the target:
These can also be set via a configuration file:
Setting the target name allows different options for different targets:
How to use a configuration file to set iSCSI configuration options:
How to set up a simple iSCSI target on loopback and access it via QEMU: this example shows how to set up an iSCSI target with one CDROM and one DISK using the Linux STGT software target. This target is available on Red Hat based systems as the package ‘scsi-target-utils’.
GlusterFS disk images¶
GlusterFS is a user space distributed file system.
You can boot from the GlusterFS disk image with the command:
URI:
JSON:
gluster is the protocol.
TYPE specifies the transport type used to connect to glustermanagement daemon (glusterd). Valid transport types aretcp and unix. In the URI form, if a transport type isn’t specified,then tcp type is assumed.
HOST specifies the server where the volume file specification forthe given volume resides. This can be either a hostname or an ipv4 address.If transport type is unix, then HOST field should not be specified.Instead socket field needs to be populated with the path to unix domainsocket.
PORT is the port number on which glusterd is listening. This is optionaland if not specified, it defaults to port 24007. If the transport type is unix,then PORT should not be specified.
VOLUME is the name of the gluster volume which contains the disk image.
PATH is the path to the actual disk image that resides on gluster volume.
debug is the logging level of the gluster protocol driver. Debug levels are 0-9, with 9 being the most verbose and 0 representing no debugging output. The default level is 4. The current logging levels defined in the gluster source are 0 - None, 1 - Emergency, 2 - Alert, 3 - Critical, 4 - Error, 5 - Warning, 6 - Notice, 7 - Info, 8 - Debug, 9 - Trace.
logfile is a command-line option that gives the log file path. Logging goes to the specified file, which also persists the gfapi logs. The default is stderr.
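As a rough illustration of how the fields above fit together, here is a small Python sketch that assembles a gluster URI. The layout `gluster+TYPE://HOST[:PORT]/VOLUME/PATH[?socket=...]` is assumed from the field descriptions, since the original command line is not reproduced here. The helper name `gluster_uri` is hypothetical, and requiring a host for tcp is a choice of this sketch rather than a documented rule.

```python
from typing import Optional

DEFAULT_GLUSTERD_PORT = 24007  # default port stated in the text

# Logging levels listed in the text; the index is the debug level.
GLUSTER_DEBUG_LEVELS = ("None", "Emergency", "Alert", "Critical", "Error",
                        "Warning", "Notice", "Info", "Debug", "Trace")


def gluster_uri(volume: str, path: str, transport: str = "tcp",
                host: Optional[str] = None, port: Optional[int] = None,
                socket: Optional[str] = None) -> str:
    """Build a gluster URI from its parts (hypothetical helper)."""
    if transport not in ("tcp", "unix"):
        raise ValueError("transport must be 'tcp' or 'unix'")
    if transport == "unix":
        # For unix transport, HOST and PORT are left out and socket is set.
        if host is not None or port is not None:
            raise ValueError("host and port must not be set for unix")
        if not socket:
            raise ValueError("unix transport needs a socket path")
        authority, query = "", f"?socket={socket}"
    else:
        if not host:
            raise ValueError("this sketch requires a host for tcp")
        authority = host if port is None else f"{host}:{port}"
        query = ""
    return f"gluster+{transport}://{authority}/{volume}/{path.lstrip('/')}{query}"


if __name__ == "__main__":
    print(gluster_uri("testvol", "a.img", host="1.2.3.4",
                      port=DEFAULT_GLUSTERD_PORT))
    print(gluster_uri("testvol", "dir/a.img", transport="unix",
                      socket="/tmp/glusterd.socket"))
```

The volume name, image path, address and socket path used here are placeholders.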
You can create a GlusterFS disk image with the command:
Examples
Secure Shell (ssh) disk images
You can access disk images located on a remote ssh server by using the ssh protocol:
Alternative syntax using properties:
ssh is the protocol.
USER is the remote user. If not specified, then the local username is tried.
SERVER specifies the remote ssh server. Any ssh server can be used, but it must implement the sftp-server protocol. Most Unix/Linux systems should work without requiring any extra configuration.
PORT is the port number on which sshd is listening. By default the standard ssh port (22) is used.
PATH is the path to the disk image.
The optional HOST_KEY_CHECK parameter controls how the remote host’s key is checked. The default is yes, which means to use the local .ssh/known_hosts file. Setting this to no turns off known-hosts checking. Or you can check that the host key matches a specific fingerprint: host_key_check=md5:78:45:8e:14:57:4f:d5:45:83:0a:0e:f3:49:82:c9:c8 (sha1: can also be used as a prefix, but note that OpenSSH tools only use MD5 to print fingerprints).
Currently authentication must be done using ssh-agent. Other authentication methods may be supported in future.
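The ssh parameters above can be combined as in the following Python sketch. It assumes the URI layout `ssh://[USER@]SERVER[:PORT]/PATH[?host_key_check=HOST_KEY_CHECK]`, which is inferred from the parameter list because the original command line is missing here. The helper `ssh_uri` and its fingerprint pattern are hypothetical; the pattern only checks the colon-separated hex shape shown in the example, not the digest length.

```python
import re
from typing import Optional

# Matches values such as "md5:78:45:8e:..." or "sha1:ab:cd:...".
_FINGERPRINT = re.compile(r"^(md5|sha1):[0-9a-f]{2}(:[0-9a-f]{2})+$")


def ssh_uri(server: str, path: str, user: Optional[str] = None,
            port: Optional[int] = None,
            host_key_check: Optional[str] = None) -> str:
    """Build an ssh disk image URI (hypothetical helper)."""
    if host_key_check is not None:
        ok = host_key_check in ("yes", "no") or _FINGERPRINT.match(host_key_check)
        if not ok:
            raise ValueError("host_key_check must be yes, no, or a fingerprint")
    authority = f"{user}@{server}" if user else server
    if port is not None:
        authority += f":{port}"
    query = f"?host_key_check={host_key_check}" if host_key_check else ""
    # The image path is treated as absolute on the remote server.
    return f"ssh://{authority}/{path.lstrip('/')}{query}"


if __name__ == "__main__":
    print(ssh_uri("example.com", "/srv/images/disk.img", user="alice"))
```

Leaving `user`, `port` and `host_key_check` unset yields the shortest form, which relies on the defaults described above (local username, port 22, known_hosts checking).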
Note: Many ssh servers do not support an fsync-style operation. The ssh driver cannot guarantee that disk flush requests are obeyed, and this causes a risk of disk corruption if the remote server or network goes down during writes. The driver will print a warning when fsync is not supported:
With sufficiently new versions of libssh and OpenSSH, fsync is supported.
NVMe disk images
NVM Express (NVMe) storage controllers can be accessed directly by a userspace driver in QEMU. This bypasses the host kernel file system and block layers while retaining QEMU block layer functionalities, such as block jobs, I/O throttling, image formats, etc. Disk I/O performance is typically higher than with -drive file=/dev/sda using either thread pool or linux-aio.
The controller will be exclusively used by the QEMU process once started. To be able to share storage between multiple VMs and other applications on the host, please use the file based protocols.
Before starting QEMU, bind the host NVMe controller to the host vfio-pci driver. For example:
Alternative syntax using properties:
HOST:BUS:SLOT.FUNC is the NVMe controller’s PCI device address on the host.
NAMESPACE is the NVMe namespace number, starting from 1.
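To show how the two fields above combine, here is a minimal Python sketch. It assumes the URI layout `nvme://HOST:BUS:SLOT.FUNC/NAMESPACE`, inferred from the field names because the original command line is missing here. The helper `nvme_uri` is hypothetical, and its pattern assumes the usual four-digit domain, two-digit bus and slot, and single-digit function form of a PCI address.

```python
import re

# Assumed textual form of a PCI address, e.g. "0000:06:0d.0".
_PCI_ADDRESS = re.compile(
    r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$")


def nvme_uri(pci_address: str, namespace: int = 1) -> str:
    """Build an NVMe URI from a PCI address and namespace (hypothetical helper)."""
    if not _PCI_ADDRESS.match(pci_address):
        raise ValueError("expected a PCI address like 0000:06:0d.0")
    if namespace < 1:
        # Namespace numbers start from 1, as stated in the text.
        raise ValueError("namespace numbers start from 1")
    return f"nvme://{pci_address}/{namespace}"


if __name__ == "__main__":
    print(nvme_uri("0000:06:0d.0", 1))
```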
Disk image file locking
By default, QEMU tries to protect image files from unexpected concurrent access, as long as it’s supported by the block protocol driver and host operating system. If multiple QEMU processes (including QEMU emulators and utilities) try to open the same image with conflicting access modes, all but the first one will get an error.
This feature is currently supported by the file protocol on Linux with the Open File Descriptor (OFD) locking API, and can be configured to fall back to POSIX locking if the POSIX host doesn’t support Linux OFD locking.
To explicitly enable image locking, specify “locking=on” in the file protocol driver options. If OFD locking is not possible, a warning will be printed and the POSIX locking API will be used. In this case there is a risk that the lock will get silently lost when doing hot plugging and block jobs, due to the shortcomings of the POSIX locking API.
QEMU transparently handles lock handover during shared storage migration. For shared virtual disk images between multiple VMs, the “share-rw” device option should be used.
By default, the guest has exclusive write access to its disk image. If the guest can safely share the disk image with other writers, the -device ...,share-rw=on parameter can be used. This is only safe if the guest is running software, such as a cluster file system, that coordinates disk accesses to avoid corruption.
Note that share-rw=on only declares the guest’s ability to share the disk. Some QEMU features, such as image file formats, require exclusive write access to the disk image and this is unaffected by the share-rw=on option.
Alternatively, locking can be fully disabled by the “locking=off” block device option. On the command line, the option is usually in the form of “file.locking=off”, as the protocol driver is normally placed as a “file” child under a format driver. For example:
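The nesting described above, where the protocol driver sits under the format driver as its “file” child, can be sketched by composing the option string in Python. This only illustrates the `file.` prefix. The helper `block_options` is hypothetical, the qcow2 format and image path are placeholders, and a real invocation may need further keys (a node name, for instance) that are not shown.

```python
def block_options(filename: str, fmt: str = "qcow2",
                  locking: str = "off") -> str:
    """Compose a format-over-file option string (hypothetical helper).

    Keys prefixed with "file." address the protocol driver that sits
    under the format driver.
    """
    if locking not in ("on", "off"):
        raise ValueError("this sketch only handles locking=on or locking=off")
    # A literal comma in a QEMU option value is written as two commas.
    escaped = filename.replace(",", ",,")
    parts = [
        f"driver={fmt}",
        "file.driver=file",
        f"file.filename={escaped}",
        f"file.locking={locking}",
    ]
    return ",".join(parts)


if __name__ == "__main__":
    print(block_options("/path/to/image"))
```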
To check if image locking is active, check the output of the “lslocks” command on the host and see if there are locks held by the QEMU process on the image file. More than one byte could be locked by the QEMU instance, each byte of which reflects a particular permission that is acquired or protected by the running block driver.
Filter drivers
QEMU supports several filter drivers, which don’t store any data, but perform some additional tasks, hooking I/O requests.
preallocate
The preallocate filter driver is intended to be inserted between format and protocol nodes and preallocates some additional space (expanding the protocol file) when writing past the file’s end. This can be useful for file systems with slow allocation.
Supported options:
prealloc-align
On preallocation, align the file length to this value (in bytes), default 1M.
prealloc-size
How much to preallocate (in bytes), default 128M.
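The interplay of the two options can be illustrated with a little arithmetic. The Python sketch below assumes that a write ending past the current end of file extends the file to the write end plus prealloc-size, rounded up to a multiple of prealloc-align. That reading is an assumption drawn from the option descriptions, and the function names are hypothetical. The filter's real behaviour may differ in detail.

```python
MiB = 1024 * 1024
DEFAULT_PREALLOC_ALIGN = 1 * MiB   # default stated in the text
DEFAULT_PREALLOC_SIZE = 128 * MiB  # default stated in the text


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return -(-value // alignment) * alignment


def preallocated_length(file_end: int, write_offset: int, write_bytes: int,
                        prealloc_size: int = DEFAULT_PREALLOC_SIZE,
                        prealloc_align: int = DEFAULT_PREALLOC_ALIGN) -> int:
    """Return the assumed file length after a write (hypothetical helper)."""
    write_end = write_offset + write_bytes
    if write_end <= file_end:
        # The write stays inside the file, so nothing is preallocated.
        return file_end
    return align_up(write_end + prealloc_size, prealloc_align)


if __name__ == "__main__":
    # A 4 KiB write at the start of an empty file.
    print(preallocated_length(0, 0, 4096) // MiB, "MiB")
```

With the defaults, a 4 KiB write to an empty file would grow it to 129 MiB under this reading: 4 KiB plus 128 MiB, rounded up to the next 1 MiB boundary.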