IOMMU Event Tracing - What it is and how it can help your distro?
http://events.linuxfoundation.org/sites/events/files/slides/iommu_tracing_feb23_2015.pdf
IOMMU Events
- IOMMU group class events
. Add device to IOMMU group (add_device_to_group)
. Remove device from IOMMU group (remove_device_from_group)
- IOMMU device class events
. Attach device to a domain (attach_device_to_domain)
. Detach device from a domain (detach_device_from_domain)
- IOMMU map/unmap events
. map
. unmap
- IOMMU error class events
. io_page_fault
root@cnode14-m:/sys/kernel/debug/tracing/events/iommu# tree -L 1
.
├── add_device_to_group
├── attach_device_to_domain
├── detach_device_from_domain
├── enable
├── filter
├── io_page_fault
├── map
├── remove_device_from_group
└── unmap
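The listing above includes a group-wide enable file, and ftrace also keeps one enable file inside each event directory. A minimal sketch of switching the events on at runtime follows. The helper name set_iommu_events and its parameters are made up for illustration, and the debugfs mount point is assumed to match the listing.

```python
from pathlib import Path

# Assumption: debugfs is mounted at /sys/kernel/debug, as in the listing above.
TRACING_ROOT = Path("/sys/kernel/debug/tracing")


def set_iommu_events(enabled, event=None, root=TRACING_ROOT):
    """Write 1 or 0 to an 'enable' file under events/iommu.

    With event=None the group-wide events/iommu/enable file is used;
    otherwise events/iommu/<event>/enable (for example event="map").
    Returns the path that was written. Hypothetical helper, not a kernel API.
    """
    base = Path(root) / "events" / "iommu"
    target = (base / event / "enable") if event else (base / "enable")
    target.write_text("1" if enabled else "0")
    return target
```

Run as root, set_iommu_events(True) would turn on every IOMMU event, and set_iommu_events(True, event="map") only the map event.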
Output Format:
- Whenever a device driver makes an IOMMU map or unmap request, the output looks like:
. for IOMMU map: IOMMU: iova=0x%016llx paddr=0x%016llx size=%zu
. for IOMMU unmap: IOMMU: iova=0x%016llx size=%zu unmapped_size=%zu
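Because the map and unmap lines have a fixed layout, they are easy to pick apart in a script. The following is a rough sketch, not part of the tracing tools. The function names are invented for illustration, and the size fields are accepted in either decimal or hex because traces may print them either way.

```python
import re

# Number in decimal or 0x-prefixed hex.
_NUM = r"(0x[0-9a-fA-F]+|\d+)"

# \b keeps the map pattern from matching inside the word "unmap".
MAP_RE = re.compile(
    r"\bmap: IOMMU: iova=(0x[0-9a-fA-F]+) paddr=(0x[0-9a-fA-F]+) size=" + _NUM
)
UNMAP_RE = re.compile(
    r"\bunmap: IOMMU: iova=(0x[0-9a-fA-F]+) size=" + _NUM + r" unmapped_size=" + _NUM
)


def parse_map_event(line):
    """Return iova, paddr and size of a map trace line, or None if it is not one."""
    m = MAP_RE.search(line)
    if m is None:
        return None
    iova, paddr, size = m.groups()
    return {"iova": int(iova, 16), "paddr": int(paddr, 16), "size": int(size, 0)}


def parse_unmap_event(line):
    """Return iova, size and unmapped_size of an unmap trace line, or None."""
    m = UNMAP_RE.search(line)
    if m is None:
        return None
    iova, size, unmapped = m.groups()
    return {
        "iova": int(iova, 16),
        "size": int(size, 0),
        "unmapped_size": int(unmapped, 0),
    }
```

Each line read from the trace file can be passed to both functions; whichever returns a dictionary identifies the event type.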
Enabling IOMMU tracing at boot time
- Use the kernel boot option trace_event (trace_event=iommu)
Output analysis
- When a device is detected, it is added to a group.
root@cnode14-m:/sys/kernel/debug/tracing# cat trace
# tracer: nop
#
# entries-in-buffer/entries-written: 68/68 #P:8
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
swapper/0-1 [004] .... 2.052806: add_device_to_group: IOMMU: groupID=0 device=0000:00:00.0
swapper/0-1 [004] .... 2.052822: add_device_to_group: IOMMU: groupID=1 device=0000:00:01.0
swapper/0-1 [004] .... 2.052831: add_device_to_group: IOMMU: groupID=2 device=0000:00:01.1
swapper/0-1 [004] .... 2.052840: add_device_to_group: IOMMU: groupID=3 device=0000:00:02.0
swapper/0-1 [004] .... 2.052849: add_device_to_group: IOMMU: groupID=4 device=0000:00:03.0
swapper/0-1 [004] .... 2.052857: add_device_to_group: IOMMU: groupID=5 device=0000:00:05.0
swapper/0-1 [004] .... 2.052862: add_device_to_group: IOMMU: groupID=5 device=0000:00:05.2
swapper/0-1 [004] .... 2.052867: add_device_to_group: IOMMU: groupID=5 device=0000:00:05.4
swapper/0-1 [004] .... 2.052873: add_device_to_group: IOMMU: groupID=6 device=0000:00:11.0
swapper/0-1 [004] .... 2.052879: add_device_to_group: IOMMU: groupID=7 device=0000:00:16.0
swapper/0-1 [004] .... 2.052887: add_device_to_group: IOMMU: groupID=8 device=0000:00:19.0
swapper/0-1 [004] .... 2.052893: add_device_to_group: IOMMU: groupID=9 device=0000:00:1a.0
...(skip)...
swapper/0-1 [004] .... 2.052956: add_device_to_group: IOMMU: groupID=15 device=0000:03:00.0
swapper/0-1 [004] .... 2.052964: add_device_to_group: IOMMU: groupID=15 device=0000:03:00.1
swapper/0-1 [004] .... 2.052977: add_device_to_group: IOMMU: groupID=16 device=0000:04:00.0
swapper/0-1 [004] .... 2.052988: add_device_to_group: IOMMU: groupID=16 device=0000:04:00.1
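The add_device_to_group lines show which devices ended up sharing a group, which matters because devices in one group are generally not isolated from each other. A small sketch of collecting that information from the trace follows. The function name devices_by_group is invented here, and the regular expression assumes the exact line layout shown above.

```python
import re
from collections import defaultdict

# Matches the event payload as it appears in the trace output above.
GROUP_RE = re.compile(r"add_device_to_group: IOMMU: groupID=(\d+) device=(\S+)")


def devices_by_group(lines):
    """Map each IOMMU group ID to the list of devices added to it, in trace order."""
    groups = defaultdict(list)
    for line in lines:
        m = GROUP_RE.search(line)
        if m:
            groups[int(m.group(1))].append(m.group(2))
    return dict(groups)


if __name__ == "__main__":
    sample = [
        "swapper/0-1 [004] .... 2.052857: add_device_to_group: IOMMU: groupID=5 device=0000:00:05.0",
        "swapper/0-1 [004] .... 2.052862: add_device_to_group: IOMMU: groupID=5 device=0000:00:05.2",
        "swapper/0-1 [004] .... 2.052873: add_device_to_group: IOMMU: groupID=6 device=0000:00:11.0",
    ]
    for gid, devs in sorted(devices_by_group(sample).items()):
        print(gid, devs)
```

Groups with more than one entry, such as group 5 in the trace above, are the ones to look at when a single device needs to be assigned on its own.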
- When a virtual machine requests DMA, map events are recorded.
qemu-system-x86-1823 [002] .... 170.207244: map: IOMMU: iova=0x00000000000ae000 paddr=0x000000011020e000 size=0x0
qemu-system-x86-1823 [002] .... 170.207244: map: IOMMU: iova=0x00000000000af000 paddr=0x000000011020f000 size=0x0
qemu-system-x86-1823 [002] .... 170.207244: map: IOMMU: iova=0x00000000000b0000 paddr=0x0000000110210000 size=0x0
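In the sample above, both iova and paddr advance by 0x1000 from one map event to the next, so the three events describe one contiguous range. The sketch below merges such consecutive events into runs. The name coalesce_maps is invented for illustration, and the 4 KiB step is an assumption taken from the address deltas, since the size field in the sample reads 0x0.

```python
import re

MAP_RE = re.compile(r"\bmap: IOMMU: iova=(0x[0-9a-fA-F]+) paddr=(0x[0-9a-fA-F]+)")

# Assumption: one map event per 4 KiB page, inferred from the sample addresses.
PAGE = 0x1000


def coalesce_maps(lines, step=PAGE):
    """Merge consecutive map events into (start_iova, start_paddr, count) runs.

    An event extends the current run only if both its iova and its paddr
    follow directly after the run's last page.
    """
    runs = []  # each run is [start_iova, start_paddr, page_count]
    for line in lines:
        m = MAP_RE.search(line)
        if m is None:
            continue
        iova, paddr = int(m.group(1), 16), int(m.group(2), 16)
        if runs:
            start_iova, start_paddr, count = runs[-1]
            if iova == start_iova + count * step and paddr == start_paddr + count * step:
                runs[-1][2] = count + 1
                continue
        runs.append([iova, paddr, 1])
    return [tuple(run) for run in runs]
```

Long runs point at large contiguous buffers being mapped page by page, which is one place where per-event tracing overhead adds up.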
Possible IOMMU ops
- Current IOMMU tracing covers only attach_dev, detach_dev, map, unmap, add_device and remove_device.
static struct iommu_ops intel_iommu_ops = {
    .domain_init = intel_iommu_domain_init,
    .domain_destroy = intel_iommu_domain_destroy,
    .attach_dev = intel_iommu_attach_device,
    .detach_dev = intel_iommu_detach_device,
    .map = intel_iommu_map,
    .unmap = intel_iommu_unmap,
    .iova_to_phys = intel_iommu_iova_to_phys,
    .domain_has_cap = intel_iommu_domain_has_cap,
    .add_device = intel_iommu_add_device,
    .remove_device = intel_iommu_remove_device,
    .pgsize_bitmap = INTEL_IOMMU_PGSIZES,
};
No DMA ops tracing
- Most of the processing happens in DMA operations such as map_page and unmap_page.
- However, there are no trace events for dma_ops.
struct dma_map_ops intel_dma_ops = {
    .alloc = intel_alloc_coherent,
    .free = intel_free_coherent,
    .map_sg = intel_map_sg,
    .unmap_sg = intel_unmap_sg,
    .map_page = intel_map_page,
    .unmap_page = intel_unmap_page,
    .mapping_error = intel_mapping_error,
};