Training AI to Navigate Interfaces as Humans Do
Abstract: Current browser automation relies either on brittle DOM-based interactions or on purely visual methods that do not understand the purpose and relationships of page elements. By capturing and learning from authentic browser interactions, we demonstrate how a hierarchical system of specialized AI components can develop both low-level mechanics and high-level intent understanding. We present concrete data structures for training such systems; methods for task decomposition, element recognition, and sequence prediction; and a hybrid training methodology that learns from both aggregated patterns and raw interactions. Our approach promises a robust, universal solution for AI-browser interaction that remains effective even as websites change.
Keywords: browser automation, AI training, human-computer interaction, web interfaces, hierarchical AI, generic user framework, DOM interaction, sequence prediction
Publication Date: August 2025
Status: Preprint
Enhancing Browser Automation with Contextual Awareness
Abstract: This paper proposes the Memory Agent, an extension to the Generic User Approach (GUA) framework for AI-browser interaction. The Memory Agent serves as a dedicated system for maintaining contextual awareness across browsing sessions, storing interaction patterns, and providing relevant historical information to other agents in the system. We explore the potential architecture, implementation considerations, and benefits of incorporating such an agent into browser automation systems. The Memory Agent represents a step toward more human-like browsing capabilities by addressing the critical aspect of memory that enables humans to leverage past experiences when navigating interfaces.
Keywords: AI memory systems, contextual awareness, browser automation, persistent cognition, agent architecture, memory consolidation, cross-session learning
Publication Date: August 2025
Status: Preprint
Toward Persistent Cognitive Architectures for AI Systems
persistent cognitive processing
Keywords: persistent cognitive processing, AI architecture, background cognition, insight generation, memory systems, human-AI collaboration, continuous learning, cognitive threads
Publication Date: August 2025
Status: Preprint