Neurosignal

https://neurosignal.rootcorp.org
The signal in the noise
Fri, 19 Jun 2026 11:40:22 +0000

🔴 TITLE: VISUALSKILL: Multimodal Skills for Computer-Use Agents
https://arxiv.org/abs/2606.18448
arXiv:2606.18448v1 Announce Type: new

Abstract: Computer-use agents (CUAs) approach human-level performance on standardised benchmarks but still struggle on long-horizon tasks and unseen software. Existing skill libraries address this with reusable skills, but represent the skill artifact as text only, despite the visual nature of GUI interaction. We propose VISUALSKILL: a hierarchical multimodal skill, tailored to each target application and organised as a central index over per-topic files, which the agent consumes through a load_topic MCP tool that fetches the relevant topic's text and figures on demand. We construct each skill with a two-stage pipeline that combines authored documentation with live-application UI exploration. On two CUA benchmarks, CUA-World and OSExpert-Eval, a Claude Code CLI agent backed by Claude Opus 4.6 reaches an average score of 0.456 with VISUALSKILL, a +15.3 point absolute lift over the no-skill baseline (0.303). Against a matched text-only skill that is