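Android emulator virtual device analysis, part 2: pipe

I. Overview

The QEMU pipe is also a virtual device, but a general-purpose one: it provides a communication channel between the guest OS and the emulator. It acts as an abstract transport layer, so that each new guest/emulator service does not need a virtual device of its own.

Earlier Android releases ran a qemud process in the guest OS for the same purpose. It talked to the emulator through the virtual serial device ttyS1, which was slow, and it has since been replaced by the pipe.

Read part 1 before this article. After this one, read part 3: the two belong together, and once you have finished both it is worth coming back to this one.

On top of this general-purpose pipe transport, the emulator provides four services: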


Available services:
-------------------

  tcp:<port>

     Open a TCP socket to a given localhost port. This provides a very fast
     pass-through that doesn't depend on the very slow internal emulator
     NAT router. Note that you can only use the file descriptor with read()
     and write() though, send() and recv() will return an ENOTSOCK error,
     as well as any socket ioctl().

     For security reasons, it is not possible to connect to non-localhost
     ports.

  unix:<path>

     Open a Unix-domain socket on the host.

  opengles

     Connects to the OpenGL ES emulation process. For now, the implementation
     is equivalent to tcp:22468, but this may change in the future.

  qemud

     Connects to the QEMUD service inside the emulator. This replaces the
     connection that was performed through /dev/ttyS1 in older Android platform
     releases. See $QEMU/docs/ANDROID-QEMUD.TXT for details.

The qemud service in turn offers several sub-services, for example:

"gsm" service
"gps" service

"hw-control" / "control" service
"sensors" service
"boot-properties" service

Part 3 shows how to communicate over qemu_pipe, using the "boot-properties" service of the qemud service as the example. This article only covers the virtual device and its driver.


II. The driver

First, the documentation:

XIV. QEMU Pipe device:
======================

Relevant files:
  $QEMU/hw/android/goldfish/pipe.c
  $KERNEL/drivers/misc/qemupipe/qemu_pipe.c

Device properties:
  Name: qemu_pipe
  Id: -1
  IrqCount: 1
  I/O Registers:
    0x00  COMMAND          W: Write to perform command (see below).
    0x04  STATUS           R: Read status
    0x08  CHANNEL          RW: Read or set current channel id.
    0x0c  SIZE             RW: Read or set current buffer size.
    0x10  ADDRESS          RW: Read or set current buffer physical address.
    0x14  WAKES            R: Read wake flags.
    0x18  PARAMS_ADDR_LOW  RW: Read/set low bytes of parameters block address.
    0x1c  PARAMS_ADDR_HIGH RW: Read/set high bytes of parameters block address.
    0x20  ACCESS_PARAMS    W: Perform access with parameter block.

This is a special device that is totally specific to QEMU, but allows guest
processes to communicate directly with the emulator with extremely high
performance. This is achieved by avoiding any in-kernel memory copies, relying
on the fact that QEMU can access guest memory at runtime (under proper
conditions controlled by the kernel).

Please refer to $QEMU/docs/ANDROID-QEMU-PIPE.TXT for full details on the
device's operations.

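1. COMMAND: the commands are CMD_OPEN, CMD_CLOSE, CMD_POLL, CMD_WRITE_BUFFER, CMD_WAKE_ON_WRITE (wake when writable), CMD_READ_BUFFER and CMD_WAKE_ON_READ (wake when readable).
2. CHANNEL: every open() of /dev/qemu_pipe allocates a new struct qemu_pipe *pipe, which amounts to opening a new channel on /dev/qemu_pipe. The channel id is CHANNEL = (unsigned long)pipe.
3. WAKES: tells the driver whether threads blocked on a read and/or a write should be woken up.
4. PARAMS_ADDR_LOW, PARAMS_ADDR_HIGH and ACCESS_PARAMS implement a fast read/write path. You can skip this part without hurting your understanding of qemu_pipe.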
struct access_params{
    uint32_t channel;
    uint32_t size;
    uint32_t address;
    uint32_t cmd;
    uint32_t result;
    /* reserved for future extension */
    uint32_t flags;
};
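When the kernel probes qemu_pipe_dev, it allocates an access_params structure and writes its guest physical address to PARAMS_ADDR_LOW and PARAMS_ADDR_HIGH.
When the kernel wants to do a fast read or write, it fills in the access_params structure and then writes to ACCESS_PARAMS to start the transfer.

On the emulator side, the virtual device takes the guest physical address formed by PARAMS_ADDR_LOW and PARAMS_ADDR_HIGH, brings the structure into emulator memory, picks up channel, size, address and cmd, and performs the operation. A single I/O access thus delivers several register values at once, which is why it is called a batch (fast) access.
Note that PARAMS_ADDR_LOW and PARAMS_ADDR_HIGH hold a guest physical address, whereas the buffer address stored inside access_params is still a guest virtual address (in qemu_pipe_read_write it is the address of the caller's buffer).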
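
The way a 64-bit address travels through the two 32-bit registers can be sketched as follows. This is only an illustration: struct params_addr_regs, write_params_addr and read_params_addr are hypothetical names that model the registers in plain memory, not the driver's or the emulator's code.

```c
#include <stdint.h>

/* Hypothetical model of the two 32-bit address registers. */
struct params_addr_regs {
    uint32_t low;   /* models PARAMS_ADDR_LOW  */
    uint32_t high;  /* models PARAMS_ADDR_HIGH */
};

/* Guest side: split a 64-bit physical address into two register writes. */
static void write_params_addr(struct params_addr_regs *regs, uint64_t paddr)
{
    regs->high = (uint32_t)(paddr >> 32);
    regs->low  = (uint32_t)(paddr & 0xFFFFFFFFu);
}

/* Device side: rebuild the 64-bit address from the two halves. */
static uint64_t read_params_addr(const struct params_addr_regs *regs)
{
    return ((uint64_t)regs->high << 32) | regs->low;
}
```

On a 32-bit guest the high half is simply zero, so the same two-register scheme covers both cases.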



The driver lives in the goldfish kernel tree at drivers/misc/qemupipe/qemu_pipe.c.

Initialization code:

static struct platform_driver qemu_pipe = {
    .probe = qemu_pipe_probe,
    .remove = qemu_pipe_remove,
    .driver = {
        .name = "qemu_pipe"
    }
};

static int __init qemu_pipe_dev_init(void)
{
    return platform_driver_register(&qemu_pipe);
}

static void qemu_pipe_dev_exit(void)
{
    platform_driver_unregister(&qemu_pipe);
}


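qemu_pipe_probe does the usual things: it gets the I/O memory resource, ioremaps it, gets the IRQ number and installs the interrupt handler. Finally it calls misc_register to register a misc character device, whose device file is /dev/qemu_pipe: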

static const struct file_operations qemu_pipe_fops = {
    .owner = THIS_MODULE,
    .read = qemu_pipe_read,
    .write = qemu_pipe_write,
    .poll = qemu_pipe_poll,
    .open = qemu_pipe_open,
    .release = qemu_pipe_release,
};

static struct miscdevice qemu_pipe_device = {
    .minor = MISC_DYNAMIC_MINOR,
    .name = "qemu_pipe",
    .fops = &qemu_pipe_fops,
};

static int qemu_pipe_probe(struct platform_device *pdev)
{
    int err;
    struct resource *r;
    struct qemu_pipe_dev *dev = pipe_dev;

    PIPE_D("Creating device\n");

    INIT_RADIX_TREE(&dev->pipes, GFP_ATOMIC);
    /* not thread safe, but this should not happen */
    if (dev->base != NULL) {
        printk(KERN_ERR "QEMU PIPE Device: already mapped at %p\n",
            dev->base);
        return -ENODEV;
    }
    r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
    if (r == NULL || r->end - r->start < PAGE_SIZE - 1) {
        printk(KERN_ERR "QEMU PIPE Device: can't allocate i/o page\n");
        return -EINVAL;
    }
    dev->base = ioremap(r->start, PAGE_SIZE);
    PIPE_D("The mapped IO base is %p\n", dev->base);

    r = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
    if (r == NULL) {
        printk(KERN_ERR "QEMU PIPE Device: failure to allocate IRQ\n");
        err = -EINVAL;
        goto err_alloc_irq;
    }
    dev->irq = r->start;
    PIPE_D("The IRQ is %d\n", dev->irq);
    err = request_irq(dev->irq, qemu_pipe_interrupt, IRQF_SHARED,
                "goldfish_pipe", dev);
    if (err)
        goto err_alloc_irq;

    spin_lock_init(&dev->lock);

    err = misc_register(&qemu_pipe_device);
    if (err)
        goto err_misc_register;

    setup_access_params_addr(dev);
    return 0;

err_misc_register:
    free_irq(dev->irq, dev); /* dev_id must match the one given to request_irq */
err_alloc_irq:
    iounmap(dev->base);
    dev->base = NULL;
    return err;
}


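qemu_pipe_open: every open() of /dev/qemu_pipe allocates a new qemu_pipe structure. Each qemu_pipe corresponds to one CHANNEL and is inserted into a radix tree. The address of the qemu_pipe is used as the CHANNEL id (two live pipes can never share one) and written to the PIPE_REG_CHANNEL register; CMD_OPEN is then written to PIPE_REG_COMMAND to open the new channel. Finally the qemu_pipe is stored in the file's private_data.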

static int qemu_pipe_open(struct inode *inode, struct file *file)
{
    unsigned long irq_flags;
    struct qemu_pipe *pipe;
    struct qemu_pipe_dev *dev = pipe_dev;
    int32_t status;
    int ret;

    /* Allocate new pipe kernel object */
    pipe = kzalloc(sizeof(*pipe), GFP_KERNEL);
    if (pipe == NULL) {
        PIPE_E("Not enough kernel memory to allocate new pipe\n");
        return -ENOMEM;
    }

    PIPE_D("Opening pipe %p\n", pipe);

    pipe->dev = dev;
    mutex_init(&pipe->lock);
    init_waitqueue_head(&pipe->wake_queue);

    /* Now, tell the emulator we're opening a new pipe. We use the
    * pipe object's address as the channel identifier for simplicity.
    */
    spin_lock_irqsave(&dev->lock, irq_flags);
    if ((ret = radix_tree_insert(&dev->pipes, (unsigned long)pipe, pipe))) {
        spin_unlock_irqrestore(&dev->lock, irq_flags);
        PIPE_E("opening pipe failed due to radix tree insertion failure\n");
        kfree(pipe);
        return ret;
    }
    writel((unsigned long)pipe, dev->base + PIPE_REG_CHANNEL);
    writel(CMD_OPEN, dev->base + PIPE_REG_COMMAND);
    status = readl(dev->base + PIPE_REG_STATUS);
    spin_unlock_irqrestore(&dev->lock, irq_flags);

    if (status < 0) {
        PIPE_E("Could not open pipe channel, error=%d\n", status);
        kfree(pipe);
        return status;
    }

    /* All is done, save the pipe into the file's private data field */
    file->private_data = pipe;
    return 0;
}


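qemu_pipe_read and qemu_pipe_write are both implemented by qemu_pipe_read_write. Note how access_ok and __get_user/__put_user are used to validate the user-space pointer. The transfer itself is simple, just a few I/O register accesses. The part that needs attention is blocking mode: when the file was not opened with O_NONBLOCK and the device returns PIPE_ERROR_AGAIN, the driver has to block and wait.
To do so it writes CMD_WAKE_ON_WRITE or CMD_WAKE_ON_READ to PIPE_REG_COMMAND and then calls wait_event_interruptible, waiting for !test_bit(wakeBit, &pipe->flags).
When the interrupt fires, the handler reads (channel, wake flags) pairs from PIPE_REG_CHANNEL and PIPE_REG_WAKES. If a channel has become readable, writable or closed, the handler clears the corresponding wait bit in pipe->flags, and wait_event_interruptible returns. If the qemu_pipe was closed, the code detects the error state after the wait and returns.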

/* This function is used for both reading from and writing to a given
 * pipe.
 */
static ssize_t qemu_pipe_read_write(struct file *filp, char __user *buffer,
                    size_t bufflen, int is_write)
{
    unsigned long irq_flags;
    struct qemu_pipe *pipe = filp->private_data;
    struct qemu_pipe_dev *dev = pipe->dev;
    const int cmd_offset = is_write ? 0
                    : (CMD_READ_BUFFER - CMD_WRITE_BUFFER);
    unsigned long address, address_end;
    int ret = 0;

    /* If the emulator already closed the pipe, no need to go further */
    if (test_bit(BIT_CLOSED_ON_HOST, &pipe->flags)) {
        PIPE_W("(write=%d) already closed!\n", is_write);
        ret = -EIO;
        goto out;
    }

    /* Zero-length reads or writes succeed */
    if (unlikely(bufflen == 0))
        goto out;

    /* Check the buffer range for access */
    if (!access_ok(is_write ? VERIFY_WRITE : VERIFY_READ,
            buffer, bufflen)) {
        ret = -EFAULT;
        PIPE_W("rw access_ok failed\n");
        goto out;
    }

    /* Serialize access to the pipe */
    if (mutex_lock_interruptible(&pipe->lock)) {
        PIPE_W("(write=%d) interrupted!\n", is_write);
        return -ERESTARTSYS;
    }

    address = (unsigned long)(void *)buffer;
    address_end = address + bufflen;

    while (address < address_end) {
        unsigned long  page_end = (address & PAGE_MASK) + PAGE_SIZE;
        unsigned long  next     = page_end < address_end ? page_end
                                 : address_end;
        unsigned long  avail    = next - address;
        int status, wakeBit;

        /* Ensure that the corresponding page is properly mapped */
        if (is_write) {
            char c;
            /* Ensure that the page is mapped and readable */
            if (__get_user(c, (char __user *)address)) {
                PIPE_E("read fault at address 0x%08x\n",
                    (unsigned int)address);
                if (!ret)
                    ret = -EFAULT;
                break;
            }
        } else {
            /* Ensure that the page is mapped and writable */
            if (__put_user(0, (char __user *)address)) {
                PIPE_E("write fault at address 0x%08x\n",
                    (unsigned int)address);
                if (!ret)
                    ret = -EFAULT;
                break;
            }
        }

        /* Now, try to transfer the bytes in the current page */
        spin_lock_irqsave(&dev->lock, irq_flags);
        if (dev->aps == NULL || access_with_param(
            dev, CMD_WRITE_BUFFER + cmd_offset, address, avail,
            pipe, &status) < 0)
        {
            writel((unsigned long)pipe,
                dev->base + PIPE_REG_CHANNEL);
            writel(avail, dev->base + PIPE_REG_SIZE);
            writel(address, dev->base + PIPE_REG_ADDRESS);
            writel(CMD_WRITE_BUFFER + cmd_offset,
                dev->base + PIPE_REG_COMMAND);
            status = readl(dev->base + PIPE_REG_STATUS);
        }
        spin_unlock_irqrestore(&dev->lock, irq_flags);

        if (status > 0) { /* Correct transfer */
            ret += status;
            address += status;
            continue;
        }

        if (status == 0)  /* EOF */
            break;

        /* An error occurred. If we already transferred stuff, just
        * return with its count. We expect the next call to return
        * an error code */
        if (ret > 0)
            break;

        /* If the error is not PIPE_ERROR_AGAIN, or if we are not in
        * non-blocking mode, just return the error code.
        */
        if (status != PIPE_ERROR_AGAIN ||
            (filp->f_flags & O_NONBLOCK) != 0) {
            ret = qemu_pipe_error_convert(status);
            break;
        }

        /* We will have to wait until more data/space is available.
        * First, mark the pipe as waiting for a specific wake signal.
        */
        wakeBit = is_write ? BIT_WAKE_ON_WRITE : BIT_WAKE_ON_READ;
        set_bit(wakeBit, &pipe->flags);

        /* Tell the emulator we're going to wait for a wake event */
        spin_lock_irqsave(&dev->lock, irq_flags);
        writel((unsigned long)pipe, dev->base + PIPE_REG_CHANNEL);
        writel(CMD_WAKE_ON_WRITE + cmd_offset,
            dev->base + PIPE_REG_COMMAND);
        spin_unlock_irqrestore(&dev->lock, irq_flags);

        /* Unlock the pipe, then wait for the wake signal */
        mutex_unlock(&pipe->lock);

        while (test_bit(wakeBit, &pipe->flags)) {
            if (wait_event_interruptible(
                    pipe->wake_queue,
                    !test_bit(wakeBit, &pipe->flags))) {
                ret = -ERESTARTSYS;
                PIPE_W("rw, wait_event error\n");
                goto out;
            }

            if (test_bit(BIT_CLOSED_ON_HOST, &pipe->flags)) {
                ret = -EIO;
                PIPE_W("rw, pipe already closed\n");
                goto out;
            }
        }

        /* Try to re-acquire the lock */
        if (mutex_lock_interruptible(&pipe->lock)) {
            ret = -ERESTARTSYS;
            goto out;
        }

        /* Try the transfer again */
        continue;
    }
    mutex_unlock(&pipe->lock);
out:
    return ret;
}

static ssize_t qemu_pipe_read(struct file *filp, char __user *buffer,
                  size_t bufflen, loff_t *ppos)
{
    return qemu_pipe_read_write(filp, buffer, bufflen, 0);
}

static ssize_t qemu_pipe_write(struct file *filp,
                const char __user *buffer, size_t bufflen,
                loff_t *ppos)
{
    return qemu_pipe_read_write(filp, (char __user *)buffer, bufflen, 1);
}
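
The main loop of qemu_pipe_read_write never lets a single transfer cross a page boundary. The arithmetic can be sketched on its own as below. This is a stand-alone illustration, not driver code: SKETCH_PAGE_SIZE is an assumed 4096-byte page, and next_chunk and count_chunks are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL                     /* assumed page size */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Size of the next chunk starting at `address`: it stops at the end of the
 * current page or at `address_end`, whichever comes first. */
static unsigned long next_chunk(unsigned long address,
                                unsigned long address_end)
{
    unsigned long page_end = (address & SKETCH_PAGE_MASK) + SKETCH_PAGE_SIZE;
    unsigned long next = page_end < address_end ? page_end : address_end;

    return next - address;
}

/* Number of per-page transfers needed for a buffer, assuming every
 * transfer moves its whole chunk (the real device may move less). */
static unsigned count_chunks(unsigned long address, unsigned long len)
{
    unsigned long end = address + len;
    unsigned n = 0;

    while (address < end) {
        address += next_chunk(address, end);
        n++;
    }
    return n;
}
```

For example, a 100-byte buffer that starts 16 bytes before a page boundary is sent as two chunks of 16 and 84 bytes. Splitting per page matters because the emulator translates only the page containing the start address and then accesses host memory directly.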


qemu_pipe_poll implements the poll/select/epoll interface. Nothing special here, it is the standard implementation:

static unsigned int qemu_pipe_poll(struct file *filp, poll_table *wait)
{
    struct qemu_pipe *pipe = filp->private_data;
    struct qemu_pipe_dev *dev = pipe->dev;
    unsigned long irq_flags;
    unsigned int mask = 0;
    int status;

    mutex_lock(&pipe->lock);

    poll_wait(filp, &pipe->wake_queue, wait);

    spin_lock_irqsave(&dev->lock, irq_flags);
    writel((unsigned long)pipe, dev->base + PIPE_REG_CHANNEL);
    writel(CMD_POLL, dev->base + PIPE_REG_COMMAND);
    status = readl(dev->base + PIPE_REG_STATUS);
    spin_unlock_irqrestore(&dev->lock, irq_flags);

    mutex_unlock(&pipe->lock);

    if (status & PIPE_POLL_IN)
        mask |= POLLIN | POLLRDNORM;

    if (status & PIPE_POLL_OUT)
        mask |= POLLOUT | POLLWRNORM;

    if (status & PIPE_POLL_HUP)
        mask |= POLLHUP;

    if (test_bit(BIT_CLOSED_ON_HOST, &pipe->flags))
        mask |= POLLERR;

    return mask;
}


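qemu_pipe_interrupt is the interrupt handler. It loops over every signaled qemu_pipe, checks whether it has become readable, writable or closed, and wakes up the corresponding threads: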

static irqreturn_t qemu_pipe_interrupt(int irq, void *dev_id)
{
    struct qemu_pipe_dev *dev = dev_id;
    unsigned long irq_flags;
    int count = 0;

    /* We're going to read from the emulator a list of (channel,flags)
    * pairs corresponding to the wake events that occurred on each
    * blocked pipe (i.e. channel).
    */
    spin_lock_irqsave(&dev->lock, irq_flags);
    for (;;) {
        /* First read the channel, 0 means the end of the list */
        struct qemu_pipe *pipe;
        unsigned long wakes;
        unsigned long channel = readl(dev->base + PIPE_REG_CHANNEL);

        if (channel == 0)
            break;

        /* Convert channel to struct pipe pointer + read wake flags */
        wakes = readl(dev->base + PIPE_REG_WAKES);
        pipe  = (struct qemu_pipe *)(ptrdiff_t)channel;

        /* check if pipe is still valid */
        if ((pipe = radix_tree_lookup(&dev->pipes,
            (unsigned long)pipe)) == NULL) {
            PIPE_W("interrupt for already closed pipe\n");
            break;
        }
        /* Did the emulator just close a pipe? */
        if (wakes & PIPE_WAKE_CLOSED) {
            set_bit(BIT_CLOSED_ON_HOST, &pipe->flags);
            wakes |= PIPE_WAKE_READ | PIPE_WAKE_WRITE;
        }
        if (wakes & PIPE_WAKE_READ)
            clear_bit(BIT_WAKE_ON_READ, &pipe->flags);
        if (wakes & PIPE_WAKE_WRITE)
            clear_bit(BIT_WAKE_ON_WRITE, &pipe->flags);

        wake_up_interruptible(&pipe->wake_queue);
        count++;
    }
    spin_unlock_irqrestore(&dev->lock, irq_flags);

    return (count == 0) ? IRQ_NONE : IRQ_HANDLED;
}
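
The way the handler turns the wake flags reported by the emulator into changes of pipe->flags can be isolated as a pure function. The sketch below is an illustration only: the bit values and the name apply_wakes are chosen for this example and are not taken from the driver headers, and flags are kept as bit masks rather than the bit numbers used with set_bit/clear_bit.

```c
#include <stdint.h>

/* Illustrative wake flags, as reported by the device (values assumed). */
enum {
    WAKE_CLOSED = 1u << 0,
    WAKE_READ   = 1u << 1,
    WAKE_WRITE  = 1u << 2
};

/* Illustrative per-pipe state bits kept by the driver (values assumed). */
enum {
    FLAG_CLOSED_ON_HOST = 1u << 0,
    FLAG_WAKE_ON_READ   = 1u << 1,
    FLAG_WAKE_ON_WRITE  = 1u << 2
};

/* Return the new pipe flags after one (channel, wakes) pair was handled. */
static uint32_t apply_wakes(uint32_t flags, uint32_t wakes)
{
    if (wakes & WAKE_CLOSED) {
        /* A close must release both readers and writers. */
        flags |= FLAG_CLOSED_ON_HOST;
        wakes |= WAKE_READ | WAKE_WRITE;
    }
    if (wakes & WAKE_READ)
        flags &= ~(uint32_t)FLAG_WAKE_ON_READ;
    if (wakes & WAKE_WRITE)
        flags &= ~(uint32_t)FLAG_WAKE_ON_WRITE;
    return flags;
}
```

A waiter sleeps while its wait bit is set, so clearing the bit is what lets wait_event_interruptible return. After a close both wait bits are cleared and the closed bit stays set, which is the state the read/write path checks after waking up.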


setup_access_params_addr and access_with_param implement the fast read/write path. Skip them if they are unclear:

/* 0 on success */
static int setup_access_params_addr(struct qemu_pipe_dev *dev)
{
    uint64_t paddr;
    struct access_params *aps;

    aps = kmalloc(sizeof(struct access_params), GFP_KERNEL);
    if (!aps)
        return -1;

    paddr = __pa(aps);
    writel((uint32_t)(paddr >> 32), dev->base + PIPE_REG_PARAMS_ADDR_HIGH);
    writel((uint32_t)paddr, dev->base + PIPE_REG_PARAMS_ADDR_LOW);

    if (!valid_batchbuffer_addr(dev, aps))
        return -1;

    dev->aps = aps;
    return 0;
}

/* A value that will not be set by qemu emulator */
#define IMPOSSIBLE_BATCH_RESULT (0xdeadbeaf)

static int access_with_param(struct qemu_pipe_dev *dev, const int cmd,
                 unsigned long address, unsigned long avail,
                 struct qemu_pipe *pipe, int *status)
{
    struct access_params *aps = dev->aps;

    aps->result = IMPOSSIBLE_BATCH_RESULT;
    aps->channel = (unsigned long)pipe;
    aps->size = avail;
    aps->address = address;
    aps->cmd = cmd;
    writel(cmd, dev->base + PIPE_REG_ACCESS_PARAMS);

    /* If aps->result unchanged, then batch command failed */
    if (aps->result == IMPOSSIBLE_BATCH_RESULT)
        return -1;

    *status = aps->result;
    return 0;
}
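
access_with_param detects an emulator without batch support through a sentinel: it stores a value in result that the emulator would never produce, and checks whether it is still there afterwards. A minimal stand-alone sketch of that pattern follows. The emulator is modelled by a function pointer, and batch_params, device_fn, device_supports_batch, device_ignores_batch and batch_access are all hypothetical names for this illustration.

```c
#include <stdint.h>

#define SENTINEL_RESULT 0xdeadbeafu  /* value the device never writes */

struct batch_params {
    uint32_t channel, size, address, cmd, result, flags;
};

/* Hypothetical stand-in for the emulator side of ACCESS_PARAMS. */
typedef void (*device_fn)(struct batch_params *p);

/* A device that implements the batch path: reports `size` bytes moved. */
static void device_supports_batch(struct batch_params *p)
{
    p->result = p->size;
}

/* An older device that silently ignores the ACCESS_PARAMS write. */
static void device_ignores_batch(struct batch_params *p)
{
    (void)p;
}

/* Returns 0 and fills *status if the device handled the batch request,
 * -1 if the sentinel is untouched and the caller must fall back to the
 * register-by-register path. */
static int batch_access(struct batch_params *p, device_fn dev, int *status)
{
    p->result = SENTINEL_RESULT;
    dev(p);
    if (p->result == SENTINEL_RESULT)
        return -1;
    *status = (int)p->result;
    return 0;
}
```

This is why qemu_pipe_read_write can try the batch path first and fall back to the plain CHANNEL/SIZE/ADDRESS/COMMAND writes when it returns a negative value.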


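A note on the different kinds of addresses involved:
1. Guest process virtual addresses: user-space addresses. Before the kernel reads or writes through one, it has to validate it, typically by going through copy_from_user and copy_to_user.
2. Guest kernel virtual addresses: 3GB~4GB with the classic 32-bit split.
3. Guest physical addresses: in the classic case, the kernel virtual address minus a fixed offset (3GB); with large physical memory the situation differs. In QEMU, safe_get_phys_page_debug translates a guest virtual address into a guest physical address.
4. Emulator virtual addresses: user-space addresses in the host OS, i.e. memory that QEMU can access directly. cpu_physical_memory_map maps a guest physical address into QEMU's own address space, after which QEMU can use the memory passed in by the kernel.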
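
The "classic case" in item 3 is plain offset arithmetic. The sketch below assumes a 32-bit guest whose kernel space starts at 3GB and whose RAM starts at physical address 0; both constants and both function names are assumptions for illustration and only apply to directly mapped (lowmem) kernel addresses.

```c
#include <stdint.h>

#define SKETCH_PAGE_OFFSET 0xC0000000u /* assumed start of kernel space (3GB) */
#define SKETCH_PHYS_OFFSET 0x00000000u /* assumed: RAM starts at physical 0 */

/* Directly mapped kernel virtual address -> physical address. */
static uint32_t sketch_virt_to_phys(uint32_t kvaddr)
{
    return kvaddr - SKETCH_PAGE_OFFSET + SKETCH_PHYS_OFFSET;
}

/* Physical address -> directly mapped kernel virtual address. */
static uint32_t sketch_phys_to_virt(uint32_t paddr)
{
    return paddr - SKETCH_PHYS_OFFSET + SKETCH_PAGE_OFFSET;
}
```

User-space addresses (item 1) have no such fixed relation to physical memory, which is why the emulator has to walk the guest page tables for the buffer address instead of subtracting a constant.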

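III. The virtual device

The code of the pipe virtual device is at http://androidxref.com/5.1.0_r1/xref/external/qemu/hw/android/goldfish/pipe.c


The initialization function is pipe_dev_init. There is not much to say about it; it is far simpler than the battery device's. The three pipe types registered at the end are for debugging only and can be ignored: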

/* initialize the trace device */
void pipe_dev_init(bool newDeviceNaming)
{
    PipeDevice *s;

    s = (PipeDevice *) g_malloc0(sizeof(*s));

    s->dev.name = newDeviceNaming ? "goldfish_pipe" : "qemu_pipe";
    s->dev.id = -1;
    s->dev.base = 0;       // will be allocated dynamically
    s->dev.size = 0x2000;
    s->dev.irq = 0;
    s->dev.irq_count = 1;

    goldfish_device_add(&s->dev, pipe_dev_readfn, pipe_dev_writefn, s);

    register_savevm(NULL,
                    "goldfish_pipe",
                    0,
                    GOLDFISH_PIPE_SAVE_VERSION,
                    goldfish_pipe_save,
                    goldfish_pipe_load,
                    s);

#if DEBUG_ZERO_PIPE
    goldfish_pipe_add_type("zero", NULL, &zeroPipe_funcs);
#endif
#if DEBUG_PINGPONG_PIPE
    goldfish_pipe_add_type("pingpong", NULL, &pingPongPipe_funcs);
#endif
#if DEBUG_THROTTLE_PIPE
    goldfish_pipe_add_type("throttle", NULL, &throttlePipe_funcs);
#endif
}


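The read function is pipe_dev_read. The register to pay attention to is PIPE_REG_CHANNEL.

Each time the kernel's interrupt handler reads PIPE_REG_CHANNEL, the virtual device returns one CHANNEL from the dev->signaled_pipes list and sets the PIPE_REG_WAKES register, telling the pipe driver on which CHANNEL it may wake up the threads blocked on a read or a write.

dev->signaled_pipes is the list of pipes whose wake condition has been met and that are waiting to be woken up. Its nodes are added by goldfish_pipe_wake.

When dev->signaled_pipes becomes NULL, the IRQ is lowered with goldfish_device_set_irq(&dev->dev, 0, 0).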

/* I/O read */
static uint32_t pipe_dev_read(void *opaque, hwaddr offset)
{
    PipeDevice *dev = (PipeDevice *)opaque;

    switch (offset) {
    case PIPE_REG_STATUS:
        DR("%s: REG_STATUS status=%d (0x%x)", __FUNCTION__, dev->status, dev->status);
        return dev->status;

    case PIPE_REG_CHANNEL:
        if (dev->signaled_pipes != NULL) {
            Pipe* pipe = dev->signaled_pipes;
            DR("%s: channel=0x%llx wanted=%d", __FUNCTION__,
               (unsigned long long)pipe->channel, pipe->wanted);
            dev->wakes = pipe->wanted;
            pipe->wanted = 0;
            dev->signaled_pipes = pipe->next_waked;
            pipe->next_waked = NULL;
            if (dev->signaled_pipes == NULL) {
                goldfish_device_set_irq(&dev->dev, 0, 0);
                DD("%s: lowering IRQ", __FUNCTION__);
            }
            return (uint32_t)(pipe->channel & 0xFFFFFFFFUL);
        }
        DR("%s: no signaled channels", __FUNCTION__);
        return 0;

    case PIPE_REG_CHANNEL_HIGH:
        if (dev->signaled_pipes != NULL) {
            Pipe* pipe = dev->signaled_pipes;
            DR("%s: channel_high=0x%llx wanted=%d", __FUNCTION__,
               (unsigned long long)pipe->channel, pipe->wanted);
            return (uint32_t)(pipe->channel >> 32);
        }
        DR("%s: no signaled channels", __FUNCTION__);
        return 0;

    case PIPE_REG_WAKES:
        DR("%s: wakes %d", __FUNCTION__, dev->wakes);
        return dev->wakes;

    case PIPE_REG_PARAMS_ADDR_HIGH:
        return (uint32_t)(dev->params_addr >> 32);

    case PIPE_REG_PARAMS_ADDR_LOW:
        return (uint32_t)(dev->params_addr & 0xFFFFFFFFUL);

    default:
        D("%s: offset=%d (0x%x)\n", __FUNCTION__, offset, offset);
    }
    return 0;
}


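The write function is pipe_dev_write. Writes to PIPE_REG_COMMAND are handled by the dedicated helper pipeDevice_doCommand. A write to PIPE_REG_ACCESS_PARAMS is a batch operation: the values of several registers are passed at once, and the read or write is then executed.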

static void pipe_dev_write(void *opaque, hwaddr offset, uint32_t value)
{
    PipeDevice *s = (PipeDevice *)opaque;

    switch (offset) {
    case PIPE_REG_COMMAND:
        DR("%s: command=%d (0x%x)", __FUNCTION__, value, value);
        pipeDevice_doCommand(s, value);
        break;

    case PIPE_REG_SIZE:
        DR("%s: size=%d (0x%x)", __FUNCTION__, value, value);
        s->size = value;
        break;

    case PIPE_REG_ADDRESS:
        DR("%s: address=%d (0x%x)", __FUNCTION__, value, value);
        uint64_set_low(&s->address, value);
        break;

    case PIPE_REG_ADDRESS_HIGH:
        DR("%s: address_high=%d (0x%x)", __FUNCTION__, value, value);
        uint64_set_high(&s->address, value);
        break;

    case PIPE_REG_CHANNEL:
        DR("%s: channel=%d (0x%x)", __FUNCTION__, value, value);
        uint64_set_low(&s->channel, value);
        break;

    case PIPE_REG_CHANNEL_HIGH:
        DR("%s: channel_high=%d (0x%x)", __FUNCTION__, value, value);
        uint64_set_high(&s->channel, value);
        break;

    case PIPE_REG_PARAMS_ADDR_HIGH:
        s->params_addr = (s->params_addr & ~(0xFFFFFFFFULL << 32) ) |
                          ((uint64_t)value << 32);
        break;

    case PIPE_REG_PARAMS_ADDR_LOW:
        s->params_addr = (s->params_addr & ~(0xFFFFFFFFULL) ) | value;
        break;

    case PIPE_REG_ACCESS_PARAMS:
    {
        struct access_params aps;
        struct access_params_64 aps64;
        uint32_t cmd;

        /* Don't touch aps.result if anything wrong */
        if (s->params_addr == 0)
            break;

        if (goldfish_guest_is_64bit()) {
            cpu_physical_memory_read(s->params_addr, (void*)&aps64,
                                     sizeof(aps64));
        } else {
            cpu_physical_memory_read(s->params_addr, (void*)&aps,
                                     sizeof(aps));
        }
        /* sync pipe device state from batch buffer */
        if (goldfish_guest_is_64bit()) {
            s->channel = aps64.channel;
            s->size = aps64.size;
            s->address = aps64.address;
            cmd = aps64.cmd;
        } else {
            s->channel = aps.channel;
            s->size = aps.size;
            s->address = aps.address;
            cmd = aps.cmd;
        }
        if ((cmd != PIPE_CMD_READ_BUFFER) && (cmd != PIPE_CMD_WRITE_BUFFER))
            break;

        pipeDevice_doCommand(s, cmd);
        if (goldfish_guest_is_64bit()) {
            aps64.result = s->status;
            cpu_physical_memory_write(s->params_addr, (void*)&aps64,
                                      sizeof(aps64));
        } else {
            aps.result = s->status;
            cpu_physical_memory_write(s->params_addr, (void*)&aps,
                                      sizeof(aps));
        }
    }
    break;

    default:
        D("%s: offset=%d (0x%x) value=%d (0x%x)\n", __FUNCTION__, offset,
            offset, value, value);
        break;
    }
}


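pipeDevice_doCommand handles open, close, poll, read, write, wake-on-read and wake-on-write.
Two things to note:

1. Right after a CHANNEL is opened, the pipe->funcs pointer refers to pipeConnector_funcs. The first data that the guest OS writes to /dev/qemu_pipe determines the name of the pipe service and its args.

From then on, pipe->funcs points to the functions implemented by that pipe service.

2. safe_get_phys_page_debug translates the guest virtual address passed in into a guest physical address, and qemu_get_ram_ptr then turns that into a virtual address in the emulator process.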

static void
pipeDevice_doCommand( PipeDevice* dev, uint32_t command )
{
    Pipe** lookup = pipe_list_findp_channel(&dev->pipes, dev->channel);
    Pipe*  pipe   = *lookup;
    CPUOldState* env = cpu_single_env;

    /* Check that we're referring a known pipe channel */
    if (command != PIPE_CMD_OPEN && pipe == NULL) {
        dev->status = PIPE_ERROR_INVAL;
        return;
    }

    /* If the pipe is closed by the host, return an error */
    if (pipe != NULL && pipe->closed && command != PIPE_CMD_CLOSE) {
        dev->status = PIPE_ERROR_IO;
        return;
    }

    switch (command) {
    case PIPE_CMD_OPEN:
        DD("%s: CMD_OPEN channel=0x%llx", __FUNCTION__, (unsigned long long)dev->channel);
        if (pipe != NULL) {
            dev->status = PIPE_ERROR_INVAL;
            break;
        }
        pipe = pipe_new(dev->channel, dev);
        pipe->next = dev->pipes;
        dev->pipes = pipe;
        dev->status = 0;
        break;

    case PIPE_CMD_CLOSE:
        DD("%s: CMD_CLOSE channel=0x%llx", __FUNCTION__, (unsigned long long)dev->channel);
        /* Remove from device's lists */
        *lookup = pipe->next;
        pipe->next = NULL;
        pipe_list_remove_waked(&dev->signaled_pipes, pipe);
        pipe_free(pipe);
        break;

    case PIPE_CMD_POLL:
        dev->status = pipe->funcs->poll(pipe->opaque);
        DD("%s: CMD_POLL > status=%d", __FUNCTION__, dev->status);
        break;

    case PIPE_CMD_READ_BUFFER: {
        /* Translate virtual address into physical one, into emulator memory. */
        GoldfishPipeBuffer  buffer;
        target_ulong        address = dev->address;
        target_ulong        page    = address & TARGET_PAGE_MASK;
        hwaddr  phys;
        phys = safe_get_phys_page_debug(ENV_GET_CPU(env), page);
#ifdef TARGET_X86_64
        phys = phys & TARGET_PTE_MASK;
#endif
        buffer.data = qemu_get_ram_ptr(phys) + (address - page);
        buffer.size = dev->size;
        dev->status = pipe->funcs->recvBuffers(pipe->opaque, &buffer, 1);
        DD("%s: CMD_READ_BUFFER channel=0x%llx address=0x%16llx size=%d > status=%d",
           __FUNCTION__, (unsigned long long)dev->channel, (unsigned long long)dev->address,
           dev->size, dev->status);
        break;
    }

    case PIPE_CMD_WRITE_BUFFER: {
        /* Translate virtual address into physical one, into emulator memory. */
        GoldfishPipeBuffer  buffer;
        target_ulong        address = dev->address;
        target_ulong        page    = address & TARGET_PAGE_MASK;
        hwaddr  phys;
        phys = safe_get_phys_page_debug(ENV_GET_CPU(env), page);
#ifdef TARGET_X86_64
        phys = phys & TARGET_PTE_MASK;
#endif
        buffer.data = qemu_get_ram_ptr(phys) + (address - page);
        buffer.size = dev->size;
        dev->status = pipe->funcs->sendBuffers(pipe->opaque, &buffer, 1);
        DD("%s: CMD_WRITE_BUFFER channel=0x%llx address=0x%16llx size=%d > status=%d",
           __FUNCTION__, (unsigned long long)dev->channel, (unsigned long long)dev->address,
           dev->size, dev->status);
        break;
    }

    case PIPE_CMD_WAKE_ON_READ:
        DD("%s: CMD_WAKE_ON_READ channel=0x%llx", __FUNCTION__, (unsigned long long)dev->channel);
        if ((pipe->wanted & PIPE_WAKE_READ) == 0) {
            pipe->wanted |= PIPE_WAKE_READ;
            pipe->funcs->wakeOn(pipe->opaque, pipe->wanted);
        }
        dev->status = 0;
        break;

    case PIPE_CMD_WAKE_ON_WRITE:
        DD("%s: CMD_WAKE_ON_WRITE channel=0x%llx", __FUNCTION__, (unsigned long long)dev->channel);
        if ((pipe->wanted & PIPE_WAKE_WRITE) == 0) {
            pipe->wanted |= PIPE_WAKE_WRITE;
            pipe->funcs->wakeOn(pipe->opaque, pipe->wanted);
        }
        dev->status = 0;
        break;

    default:
        D("%s: command=%d (0x%x)\n", __FUNCTION__, command, command);
    }
}


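The pipeConnector_funcs function table mentioned under pipeDevice_doCommand has only one meaningful entry, pipeConnector_sendBuffers; the others are empty stubs.

pipeConnector_sendBuffers handles the first write that the guest OS makes to /dev/qemu_pipe. The data has the form pipe:<service name>:<args>. The function looks up the matching pipe service, calls its init function to obtain peer (the QemudPipe of part 3, which is also the opaque argument of the pipe service funcs), and then points pipe->funcs at the funcs provided by the pipe service.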

static int
pipeConnector_sendBuffers( void* opaque, const GoldfishPipeBuffer* buffers, int numBuffers )
{
    PipeConnector* pcon = opaque;
    const GoldfishPipeBuffer*  buffers_limit = buffers + numBuffers;
    int ret = 0;

    DD("%s: channel=0x%llx numBuffers=%d", __FUNCTION__,
       (unsigned long long)pcon->pipe->channel,
       numBuffers);

    while (buffers < buffers_limit) {
        int  avail;

        DD("%s: buffer data (%3d bytes): '%.*s'", __FUNCTION__,
           buffers[0].size, buffers[0].size, buffers[0].data);

        if (buffers[0].size == 0) {
            buffers++;
            continue;
        }

        avail = sizeof(pcon->buffer) - pcon->buffpos;
        if (avail > buffers[0].size)
            avail = buffers[0].size;

        if (avail > 0) {
            memcpy(pcon->buffer + pcon->buffpos, buffers[0].data, avail);
            pcon->buffpos += avail;
            ret += avail;
        }
        buffers++;
    }

    /* Now check that our buffer contains a zero-terminated string */
    if (memchr(pcon->buffer, '\0', pcon->buffpos) != NULL) {
        /* Acceptable formats for the connection string are:
         *
         *   pipe:<name>
         *   pipe:<name>:<arguments>
         */
        char* pipeName;
        char* pipeArgs;

        D("%s: connector: '%s'", __FUNCTION__, pcon->buffer);

        if (memcmp(pcon->buffer, "pipe:", 5) != 0) {
            /* Nope, we don't handle these for now. */
            D("%s: Unknown pipe connection: '%s'", __FUNCTION__, pcon->buffer);
            return PIPE_ERROR_INVAL;
        }

        pipeName = pcon->buffer + 5;
        pipeArgs = strchr(pipeName, ':');

        if (pipeArgs != NULL) {
            *pipeArgs++ = '\0';
            if (!*pipeArgs)
                pipeArgs = NULL;
        }

        Pipe* pipe = pcon->pipe;
        const PipeService* svc = goldfish_pipe_find_type(pipeName);
        if (svc == NULL) {
            D("%s: Unknown server!", __FUNCTION__);
            return PIPE_ERROR_INVAL;
        }

        void*  peer = svc->funcs.init(pipe, svc->opaque, pipeArgs);
        if (peer == NULL) {
            D("%s: Initialization failed!", __FUNCTION__);
            return PIPE_ERROR_INVAL;
        }

        /* Do the evil switch now */
        pipe->opaque = peer;
        pipe->service = svc;
        pipe->funcs  = &svc->funcs;
        pipe->args   = ASTRDUP(pipeArgs);
        AFREE(pcon);
    }

    return ret;
}
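
The parsing step in the middle of pipeConnector_sendBuffers can be tried out in isolation. The sketch below mirrors that logic as a stand-alone function; parse_connector is a hypothetical name, and the function modifies the buffer in place just as the emulator code does.

```c
#include <string.h>

/* Parse "pipe:<name>" or "pipe:<name>:<args>" in place.
 * On success returns 0, points *name at the service name and *args at
 * the arguments (NULL if there are none or they are empty).
 * Returns -1 if the string does not start with "pipe:". */
static int parse_connector(char *buf, char **name, char **args)
{
    char *sep;

    if (strncmp(buf, "pipe:", 5) != 0)
        return -1;

    *name = buf + 5;
    *args = NULL;

    sep = strchr(*name, ':');
    if (sep != NULL) {
        *sep = '\0';              /* terminate the service name */
        if (sep[1] != '\0')
            *args = sep + 1;      /* non-empty argument string */
    }
    return 0;
}
```

With the input "pipe:qemud:boot-properties" this yields the service name "qemud" and the args "boot-properties", which is the connection used as the example in part 3.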



goldfish_pipe_add_type registers a pipe service:

void
goldfish_pipe_add_type(const char*               pipeName,
                       void*                     pipeOpaque,
                       const GoldfishPipeFuncs*  pipeFuncs )
{
    PipeServices* list = _pipeServices;
    int           count = list->count;

    if (count >= MAX_PIPE_SERVICES) {
        APANIC("Too many goldfish pipe services (%d)", count);
    }

    if (strlen(pipeName) > MAX_PIPE_SERVICE_NAME_SIZE) {
        APANIC("Pipe service name too long: '%s'", pipeName);
    }

    list->services[count].name   = pipeName;
    list->services[count].opaque = pipeOpaque;
    list->services[count].funcs  = pipeFuncs[0];

    list->count++;
}


goldfish_pipe_find_type looks up a pipe service by name:

static const PipeService*
goldfish_pipe_find_type(const char*  pipeName)
{
    PipeServices* list = _pipeServices;
    int           count = list->count;
    int           nn;

    for (nn = 0; nn < count; nn++) {
        if (!strcmp(list->services[nn].name, pipeName)) {
            return &list->services[nn];
        }
    }
    return NULL;
}


pipe_list_findp_channel, pipe_list_findp_waked and pipe_list_remove_waked are linked-list helpers:

static Pipe**
pipe_list_findp_channel( Pipe** list, uint64_t channel )
{
    Pipe** pnode = list;
    for (;;) {
        Pipe* node = *pnode;
        if (node == NULL || node->channel == channel) {
            break;
        }
        pnode = &node->next;
    }
    return pnode;
}

static Pipe**
pipe_list_findp_waked( Pipe** list, Pipe* pipe )
{
    Pipe** pnode = list;
    for (;;) {
        Pipe* node = *pnode;
        if (node == NULL || node == pipe) {
            break;
        }
        pnode = &node->next_waked;
    }
    return pnode;
}


static void
pipe_list_remove_waked( Pipe** list, Pipe*  pipe )
{
    Pipe** lookup = pipe_list_findp_waked(list, pipe);
    Pipe*  node   = *lookup;

    if (node != NULL) {
        (*lookup) = node->next_waked;
        node->next_waked = NULL;
    }
}


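goldfish_pipe_wake is mostly called from the concrete pipe services: when a service has data for the guest to read, or can accept more data from the guest, it wakes up the waiting threads: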

void
goldfish_pipe_wake( void* hwpipe, unsigned flags )
{
    Pipe*  pipe = hwpipe;
    Pipe** lookup;
    PipeDevice*  dev = pipe->device;

    DD("%s: channel=0x%llx flags=%d", __FUNCTION__, (unsigned long long)pipe->channel, flags);

    /* If not already there, add to the list of signaled pipes */
    lookup = pipe_list_findp_waked(&dev->signaled_pipes, pipe);
    if (!*lookup) {
        pipe->next_waked = dev->signaled_pipes;
        dev->signaled_pipes = pipe;
    }
    pipe->wanted |= (unsigned)flags;

    /* Raise IRQ to indicate there are items on our list ! */
    goldfish_device_set_irq(&dev->dev, 0, 1);
    DD("%s: raising IRQ", __FUNCTION__);
}
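
How goldfish_pipe_wake and the PIPE_REG_CHANNEL read in pipe_dev_read cooperate can be simulated without any emulator code. The sketch below is a simplified model under stated assumptions: sk_pipe, sk_dev, sk_wake and sk_read_channel are hypothetical names, the IRQ line is reduced to an integer, and channels are 32-bit.

```c
#include <stddef.h>
#include <stdint.h>

struct sk_pipe {
    uint32_t        channel;
    unsigned        wanted;      /* pending wake flags */
    struct sk_pipe *next_waked;  /* link in the signaled list */
};

struct sk_dev {
    struct sk_pipe *signaled;    /* models dev->signaled_pipes */
    unsigned        wakes;       /* models the WAKES register */
    int             irq_level;   /* models the IRQ line (0 or 1) */
};

/* Service side: queue the pipe once, accumulate flags, raise the IRQ. */
static void sk_wake(struct sk_dev *dev, struct sk_pipe *pipe, unsigned flags)
{
    struct sk_pipe *p = dev->signaled;

    while (p != NULL && p != pipe)
        p = p->next_waked;
    if (p == NULL) {             /* not queued yet: push at the head */
        pipe->next_waked = dev->signaled;
        dev->signaled = pipe;
    }
    pipe->wanted |= flags;
    dev->irq_level = 1;
}

/* Guest read of CHANNEL: pop one pipe, latch its flags into WAKES,
 * lower the IRQ when the list is empty. Returns 0 at end of list. */
static uint32_t sk_read_channel(struct sk_dev *dev)
{
    struct sk_pipe *pipe = dev->signaled;

    if (pipe == NULL)
        return 0;
    dev->wakes = pipe->wanted;
    pipe->wanted = 0;
    dev->signaled = pipe->next_waked;
    pipe->next_waked = NULL;
    if (dev->signaled == NULL)
        dev->irq_level = 0;
    return pipe->channel;
}
```

The guest interrupt handler corresponds to calling sk_read_channel in a loop until it returns 0, reading wakes after each non-zero channel. Waking the same pipe twice before the guest drains the list merges the flags instead of queuing the pipe a second time.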


goldfish_pipe_close: when a pipe is closed from the emulator side, the waiting threads have to be woken up as well:

void
goldfish_pipe_close( void* hwpipe )
{
    Pipe* pipe = hwpipe;

    D("%s: channel=0x%llx (closed=%d)", __FUNCTION__, (unsigned long long)pipe->channel, pipe->closed);

    if (!pipe->closed) {
        pipe->closed = 1;
        goldfish_pipe_wake( hwpipe, PIPE_WAKE_CLOSED );
    }
}


