2025-04-27
Deep Learning

Contents

Hands-on with the data

Paper: https://arxiv.org/pdf/2407.17490
Code: https://github.com/YuxiangChai/AMEX-codebase/tree/main/data_utils
Dataset: https://huggingface.co/datasets/Yuxiang007/AMEX


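The AMEX dataset provides three levels of annotation:

  1. GUI interactive element grounding:
  • Interactive elements are classified into two types: clickable elements and scrollable elements.
  2. GUI screen and element functionality descriptions:
  • Functional descriptions of screens and elements are generated with GPT and then checked by humans for accuracy.
  3. Complex natural-language instructions with GUI action chains:
  • The action chain for each instruction averages 12.8 steps, notably more than in existing datasets.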

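Hands-on with the data

After downloading the data, merge the split archive and extract it:

    zip --fix screenshot.zip --out screenshot_merged.zip
    unzip screenshot_merged.zip  # extract

    # Alternatively
    sudo apt install -y p7zip-full
    7z x -mmt=on screenshot.zip  # extract with multithreading enabled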
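Before extracting a multi-gigabyte archive, it can be worth confirming that the merged file is intact. The following is a minimal sketch using only Python's standard `zipfile` module; the helper name `check_zip` is made up for illustration, and the file name `screenshot_merged.zip` is simply the output name used in the command above.

```python
import zipfile
from pathlib import Path


def check_zip(path):
    """Return (ok, detail); ok is True when every member passes its CRC check."""
    path = Path(path)
    # is_zipfile returns False for missing or unreadable files as well
    if not zipfile.is_zipfile(path):
        return False, "not a readable zip archive"
    with zipfile.ZipFile(path) as zf:
        bad = zf.testzip()  # name of the first corrupt member, or None
        if bad is not None:
            return False, f"corrupt member: {bad}"
        return True, f"{len(zf.namelist())} members OK"


if __name__ == "__main__":
    # Assumed file name, taken from the merge command above
    print(check_zip("screenshot_merged.zip"))
```

Note that `testzip()` reads every member, so on a large archive this check takes roughly as long as extraction itself.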

Data statistics

Total JSON files processed: 3046
Total dictionaries (steps) found: 38709

Number of dictionaries per action type:

  • PRESS_BACK: 135
  • PRESS_ENTER: 651
  • PRESS_HOME: 13
  • SWIPE: 7628
  • TAP: 24815
  • TASK_COMPLETE: 2828
  • TASK_IMPOSSIBLE: 220
  • TYPE: 2419
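Counts like the ones above can be produced with a short script. The sketch below is not the script that generated these numbers; it assumes the instruction annotations are a flat directory of `.json` files and treats any dictionary carrying an `"action"` key as a step, wherever it is nested. The directory path is a placeholder and the function names are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def iter_steps(obj):
    """Yield every dict that has an "action" key, at any nesting depth."""
    if isinstance(obj, dict):
        if "action" in obj:
            yield obj
        for value in obj.values():
            yield from iter_steps(value)
    elif isinstance(obj, list):
        for item in obj:
            yield from iter_steps(item)


def count_actions(json_dir):
    """Return (number of JSON files, Counter of action types) for a directory."""
    files = sorted(Path(json_dir).glob("*.json"))
    counts = Counter()
    for f in files:
        with open(f, encoding="utf-8") as fh:
            data = json.load(fh)
        counts.update(step["action"] for step in iter_steps(data))
    return len(files), counts


if __name__ == "__main__":
    # Placeholder path: point this at the extracted annotation directory
    n_files, counts = count_actions("path/to/instruction_json")
    print("Total JSON files processed:", n_files)
    print("Total dictionaries (steps) found:", sum(counts.values()))
    for action, n in sorted(counts.items()):
        print(f"{action}: {n}")
```

Walking the structure recursively avoids hard-coding the name of the key that holds the step list, which this post does not show.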

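Sample example

There are 8 distinct action types in total:

  1. SWIPE - a swipe, with a direction (UP/DOWN/LEFT/RIGHT)
  2. TAP - a tap, with a coordinate position
  3. TYPE - text input
  4. PRESS_BACK - press the Back key
  5. PRESS_HOME - press the Home key
  6. PRESS_ENTER - press the Enter key
  7. TASK_COMPLETE - the task is complete
  8. TASK_IMPOSSIBLE - the task cannot be completed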
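When loading the annotations, it can help to pin these eight names down in code so that a typo or an unexpected action fails loudly. This is a small sketch; `ActionType`, `TOUCH_ACTIONS`, and `parse_action` are names made up for this example, not part of the AMEX codebase.

```python
from enum import Enum


class ActionType(str, Enum):
    SWIPE = "SWIPE"
    TAP = "TAP"
    TYPE = "TYPE"
    PRESS_BACK = "PRESS_BACK"
    PRESS_HOME = "PRESS_HOME"
    PRESS_ENTER = "PRESS_ENTER"
    TASK_COMPLETE = "TASK_COMPLETE"
    TASK_IMPOSSIBLE = "TASK_IMPOSSIBLE"


# Actions that involve a position on screen, per the list above
TOUCH_ACTIONS = {ActionType.SWIPE, ActionType.TAP}


def parse_action(step):
    """Return the ActionType of a step dict; raises ValueError for unknown names."""
    return ActionType(step["action"])
```

For example, `parse_action({"action": "TAP"})` returns `ActionType.TAP`, while an unlisted name raises `ValueError`.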

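The sample (not shown here) records an interaction in the IKEA Android app in which the user shops for a gaming chair. Its key parts are:

  1. Task description:
  • instruction: the user wants to buy a gaming chair, save the three lowest-priced products to the wishlist, and choose black where possible.
  • episode_id: a unique identifier used to track this particular interaction session
  2. Device information:
  • The device screen is 1440 x 2960 pixels
  • The app package name is hk.ikea.android (the IKEA Hong Kong Android app)
  3. Interaction steps: the whole flow contains 16 steps, and the main actions are:
  • Swipe (SWIPE): browse the page content
  • Tap (TAP): select a specific item
  • Type (TYPE): search for "gaming chair"
  • Press Enter (PRESS_ENTER): confirm the search
  • Task complete (TASK_COMPLETE): mark the end of the task
  4. Interaction details:
  • Each step records a touch coordinate (touch_coord) and a lift coordinate (lift_coord)
  • Each step stores the corresponding screenshot (image_path)
  • Each step is flagged for whether it needs a human check (need_human_check)
  5. Flow analysis: the user's flow is roughly:
     1. Open the app
     2. Search for "gaming chair"
     3. Browse the search results
     4. Pick the three lowest-priced products
     5. Add the selected products to the wishlist
     6. Complete the task

Action type examples

Action Type: SWIPE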

Sample 1: { "step_id": 1, "action": "SWIPE", "touch_coord": [ 637, 2567 ], "lift_coord": [ 564, 1390 ], "device_dim": [ 1440, 3040 ], "package_name": "com.android.launcher3", "type_text": "", "need_human_check": false, "interest_region": [ [ 0, 0 ], [ 0, 0 ] ], "image_path": "2024_4_28_15_40_b31f339a2d6441bb994b4a30f184a267-1.png" }
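The SWIPE sample above stores a start point (`touch_coord`) and an end point (`lift_coord`) rather than an explicit direction label, so the UP/DOWN/LEFT/RIGHT direction has to be derived. The sketch below is one plausible way to do that, under two assumptions that this post does not confirm: coordinates are `[x, y]` with the origin at the top-left of the screen, and the direction names the movement of the finger (which is the opposite of the direction the content scrolls). The function name is hypothetical.

```python
def swipe_direction(touch_coord, lift_coord):
    """Classify a swipe by its dominant axis; returns None for a zero-length swipe."""
    dx = lift_coord[0] - touch_coord[0]
    dy = lift_coord[1] - touch_coord[1]
    if dx == 0 and dy == 0:
        return None
    if abs(dx) >= abs(dy):
        return "RIGHT" if dx > 0 else "LEFT"
    # Assumed convention: y grows downward, so a negative dy is an upward swipe
    return "DOWN" if dy > 0 else "UP"


# Coordinates from the SWIPE sample above
print(swipe_direction([637, 2567], [564, 1390]))  # UP
```

In the sample the finger moves 73 px left and 1177 px up, so the vertical axis dominates and the swipe is classified as UP.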

Action Type: TAP

Sample 1: { "step_id": 3, "action": "TAP", "touch_coord": [ 457, 1633 ], "lift_coord": [ 457, 1633 ], "device_dim": [ 1440, 2960 ], "package_name": "com.android.launcher3", "type_text": "", "need_human_check": false, "interest_region": [ [ 0, 0 ], [ 0, 0 ] ], "image_path": "2024_4_22_18_56_6623dffea11a48f2955bafde23a3f1c7-3.png" }

Action Type: TYPE

Sample 1: { "step_id": 4, "action": "TYPE", "touch_coord": [ 0, 0 ], "lift_coord": [ 0, 0 ], "device_dim": [ 1440, 2960 ], "package_name": "com.amazon.mShop.android.shopping", "type_text": "Nike Basketball Shoes", "need_human_check": false, "interest_region": [ [ 0, 0 ], [ 0, 0 ] ], "image_path": "2024_5_14_15_40_ca5bacfe2a574da4ac05174b973cf321-4.png" }

Action Type: PRESS_BACK

Sample 1: { "step_id": 4, "action": "PRESS_BACK", "touch_coord": [ 0, 0 ], "lift_coord": [ 0, 0 ], "device_dim": [ 1440, 2960 ], "package_name": "com.discord", "type_text": "", "need_human_check": false, "interest_region": [ [ 0, 0 ], [ 0, 0 ] ], "image_path": "2024_5_13_16_3_61ec5c908f0f4969b460fb47bf9eb054-4.png" }

Action Type: PRESS_ENTER

Sample 1: { "step_id": 4, "action": "PRESS_ENTER", "touch_coord": [ 0, 0 ], "lift_coord": [ 0, 0 ], "device_dim": [ 1440, 3120 ], "package_name": "com.microsoft.office.outlook", "type_text": "", "need_human_check": false, "interest_region": [ [ 0, 0 ], [ 0, 0 ] ], "image_path": "2024_5_8_15_56_d950b959ebad431a84368d7b63da755d-4.png" }

Action Type: TASK_IMPOSSIBLE

Sample 1: { "step_id": 5, "action": "TASK_IMPOSSIBLE", "touch_coord": [ 0, 0 ], "lift_coord": [ 0, 0 ], "device_dim": [ 1440, 3120 ], "package_name": "com.espn.score_center", "type_text": "", "need_human_check": false, "interest_region": [ [ 0, 0 ], [ 0, 0 ] ], "image_path": "2024_4_26_0_42_87159d38462b4ffdb7fee96370eea8cf-5.png" }

Sample 2: { "step_id": 6, "action": "TASK_IMPOSSIBLE", "touch_coord": [ 0, 0 ], "lift_coord": [ 0, 0 ], "device_dim": [ 1440, 3040 ], "package_name": "musclebooster.workout.home.gym.abs.loseweight", "type_text": "", "need_human_check": false, "interest_region": [ [ 0, 0 ], [ 0, 0 ] ], "image_path": "2024_4_22_16_39_72e6b2de98254c44b9295e165567ac83-6.png" }

Sample 3: { "step_id": 10, "action": "TASK_IMPOSSIBLE", "touch_coord": [ 0, 0 ], "lift_coord": [ 0, 0 ], "device_dim": [ 1440, 3120 ], "package_name": "com.podcast.podcasts", "type_text": "", "need_human_check": false, "interest_region": [ [ 0, 0 ], [ 0, 0 ] ], "image_path": "2024_4_26_15_24_95f480ce21604f1792190e5bfef54549-10.png" }

Sample 4: { "step_id": 15, "action": "TASK_IMPOSSIBLE", "touch_coord": [ 0, 0 ], "lift_coord": [ 0, 0 ], "device_dim": [ 1440, 3040 ], "package_name": "com.agoda.mobile.consumer", "type_text": "", "need_human_check": false, "interest_region": [ [ 0, 0 ], [ 0, 0 ] ], "image_path": "2024_4_28_15_41_f7039f20e9944e92bdada777525a2268-15.png" }

Sample 5: { "step_id": 12, "action": "TASK_IMPOSSIBLE", "touch_coord": [ 0, 0 ], "lift_coord": [ 0, 0 ], "device_dim": [ 1440, 3040 ], "package_name": "musclebooster.workout.home.gym.abs.loseweight", "type_text": "", "need_human_check": false, "interest_region": [ [ 0, 0 ], [ 0, 0 ] ], "image_path": "2024_4_22_16_27_3d79d989a7fd4f22a7cf0ce44ec8d47e-12.png" }

Action Type: PRESS_HOME

Sample 1: { "step_id": 13, "action": "PRESS_HOME", "touch_coord": [ 0, 0 ], "lift_coord": [ 0, 0 ], "device_dim": [ 1440, 2960 ], "package_name": "com.seatgeek.android", "type_text": "", "need_human_check": false, "interest_region": [ [ 0, 0 ], [ 0, 0 ] ], "image_path": "2024_3_20_14_36_21b637d11bea46b8adb3c2efc9f03501-13.png" }

Action Type: TASK_COMPLETE

Sample 1: { "step_id": 18, "action": "TASK_COMPLETE", "touch_coord": [ 0, 0 ], "lift_coord": [ 0, 0 ], "device_dim": [ 1440, 3120 ], "package_name": "com.microsoft.teams", "type_text": "", "need_human_check": false, "interest_region": [ [ 0, 0 ], [ 0, 0 ] ], "image_path": "2024_4_24_18_25_f100d55037f04908900699457c6f5676-18.png" }
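The samples come from devices with different screen heights (2960, 3040, and 3120 px), so raw pixel coordinates are not directly comparable across episodes. A common remedy is to normalize each point by the device dimensions. The sketch below assumes `device_dim` is `[width, height]`, which matches the 1440 x 2960 screen described earlier but is not stated explicitly in the data; `normalize_point` and `summarize_step` are hypothetical helper names.

```python
def normalize_point(point, device_dim):
    """Scale an [x, y] pixel coordinate to the 0..1 range, rounded to 4 decimals."""
    width, height = device_dim  # assumed order: [width, height]
    return (round(point[0] / width, 4), round(point[1] / height, 4))


def summarize_step(step):
    """Keep only the fields that are meaningful for the step's action type."""
    action = step["action"]
    dim = step["device_dim"]
    if action == "TAP":
        return {"action": action, "point": normalize_point(step["touch_coord"], dim)}
    if action == "SWIPE":
        return {
            "action": action,
            "start": normalize_point(step["touch_coord"], dim),
            "end": normalize_point(step["lift_coord"], dim),
        }
    if action == "TYPE":
        return {"action": action, "text": step["type_text"]}
    # PRESS_* and TASK_* steps carry no position or text in the samples above
    return {"action": action}


# Fields copied from the TAP sample above
tap = {"action": "TAP", "touch_coord": [457, 1633], "lift_coord": [457, 1633],
       "device_dim": [1440, 2960], "type_text": ""}
print(summarize_step(tap))
```

Dropping the `[0, 0]` placeholder coordinates for non-touch actions keeps them from being mistaken for real taps at the top-left corner.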

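Author: Dong

Article link:

Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY-NC (Creative Commons Attribution-NonCommercial 4.0 International). You may freely repost and adapt them for non-commercial purposes, provided you credit the source and link to the original author.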