NeuralAgent: An Open-Source AI Agent for Native Desktop Automation

July 28, published in Automation Tools

NeuralAgent is a native desktop assistant designed to execute complex workflows through simple natural language commands. Unlike traditional chatbots, it interacts directly with your operating system—simulating keystrokes, managing mouse movements, navigating browsers, filling out forms, and sending emails. The agent is capable of operating in the foreground or running tasks silently in the background.

Desktop automation is powered by pyautogui. Background browser automation is currently available only on Windows, where it runs through the Windows Subsystem for Linux (WSL). NeuralAgent works with several model providers: Claude, GPT-4, Azure OpenAI, Amazon Bedrock, Ollama, and Google Gemini. A set of modular agents for planning, classification, and task suggestion analyzes both text input and real-time screen content to decide the next step. The backend is built on FastAPI and the frontend on Electron and React, and both can be customized.

Official site: www.getneuralagent.com

Key Features

Desktop Automation: Native control via pyautogui.
Background Processes: Browser-focused automation for Windows users via WSL.
Broad Model Support: Integration with Claude, GPT-4, Azure OpenAI, Bedrock, Ollama, and Gemini.
Modular Architecture: Dedicated agents for planning, classification, task suggestion, and title generation.
Multimodal Perception: The agent interprets on-screen visuals alongside user instructions.
Modern Stack: Powered by FastAPI, Electron, and React.

Prerequisites

Before beginning the installation, ensure the following software is installed on your system:

Tool	Purpose	Recommended Version
Python	Backend and local AI agent processes	>= 3.9
PostgreSQL	Relational database for data persistence	>= 13
Node.js + npm	Electron and React frontend components	Node >= 18, npm >= 9

Installation Links:

Python: www.python.org/downloads/
PostgreSQL: www.postgresql.org/download/
Node.js: nodejs.org/en/download

NeuralAgent is compatible with Windows, macOS, and Linux. Note that background browser control via WSL is currently exclusive to Windows.
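
Before moving on, the version requirements in the table above can be checked with a short script. This is only a sketch and is not part of NeuralAgent: it assumes that `node`, `npm`, and the PostgreSQL client `psql` are on your PATH and that each one answers `--version`. A PostgreSQL server installed without `psql` on the PATH will be reported as missing even though it may work fine.

```python
import re
import shutil
import subprocess
import sys

# Minimum versions copied from the prerequisites table.
# Using "psql" to detect PostgreSQL is an assumption of this sketch.
MINIMUMS = {
    "node": (18,),
    "npm": (9,),
    "psql": (13,),
}


def parse_version(text):
    """Extract the first dotted version number from a tool's version output."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def check_tool(command, minimum):
    """Return (ok, detail) for one command-line tool."""
    path = shutil.which(command)
    if path is None:
        return False, "not found on PATH"
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    version = parse_version(result.stdout or result.stderr)
    if version is None:
        return False, "could not parse version output"
    return version >= minimum, ".".join(map(str, version))


def main():
    python_ok = sys.version_info >= (3, 9)
    print(f"python {sys.version.split()[0]}: {'ok' if python_ok else 'too old'}")
    for command, minimum in MINIMUMS.items():
        tool_ok, detail = check_tool(command, minimum)
        print(f"{command} {detail}: {'ok' if tool_ok else 'check manually'}")


if __name__ == "__main__":
    main()
```

Run it with the same Python interpreter you plan to use for the backend, since that is the version it reports.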

Installation and Setup

You will need two terminal windows: one to host the backend server and another for the desktop application.

Backend Configuration

Initialize a virtual environment (recommended):

```
cd backend
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
```

Install the required dependencies:
```
pip install -r requirements.txt
```
Database Setup: Create a local PostgreSQL database. Ensure the PostgreSQL service is running before proceeding.

Environment Configuration: Copy .env.example to a new file named .env and provide your specific credentials:

```
DB_HOST=
DB_PORT=
DB_DATABASE=
DB_USERNAME=
DB_PASSWORD=   # Leave blank if not required
DB_CONNECTION_STRING=
JWT_ISS=NeuralAgentBackend
JWT_SECRET=   # Generate a unique random string
REDIS_CONNECTION=   # Optional

# Amazon Bedrock
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
BEDROCK_REGION=us-west-2

# Azure OpenAI
AZURE_OPENAI_ENDPOINT=
AZURE_OPENAI_API_KEY=
OPENAI_API_VERSION=2024-12-01-preview

# OpenAI / Anthropic
OPENAI_API_KEY=
ANTHROPIC_API_KEY=

# Google Gemini
GOOGLE_API_KEY=

# Ollama (Local)
OLLAMA_URL=http://127.0.0.1:11434

# Agent Model Assignments
CLASSIFIER_AGENT_MODEL_TYPE=openai|azure_openai|anthropic|bedrock|ollama|gemini
CLASSIFIER_AGENT_MODEL_ID=gpt-4.1
TITLE_AGENT_MODEL_TYPE=openai|azure_openai|anthropic|bedrock|ollama|gemini
TITLE_AGENT_MODEL_ID=gpt-4.1-nano
SUGGESTOR_AGENT_MODEL_TYPE=openai|azure_openai|anthropic|bedrock|ollama|gemini
SUGGESTOR_AGENT_MODEL_ID=gpt-4.1-mini
PLANNER_AGENT_MODEL_TYPE=openai|azure_openai|anthropic|bedrock|ollama|gemini
PLANNER_AGENT_MODEL_ID=gpt-4.1
COMPUTER_USE_AGENT_MODEL_TYPE=openai|azure_openai|anthropic|bedrock|ollama|gemini
COMPUTER_USE_AGENT_MODEL_ID=us.anthropic.claude-sonnet-4-20250514-v1:0

# Screenshot logging for training (Disabled by default)
ENABLE_SCREENSHOT_LOGGING_FOR_TRAINING=false
AWS_DEFAULT_REGION=us-east-1
AWS_BUCKET=

# LangSmith Tracing
LANGCHAIN_TRACING_V2=false
LANGCHAIN_ENDPOINT=
LANGCHAIN_API_KEY=
LANGCHAIN_PROJECT=

# Google Authentication (Optional)
GOOGLE_LOGIN_CLIENT_ID=
GOOGLE_LOGIN_CLIENT_SECRET=
GOOGLE_LOGIN_DESKTOP_REDIRECT_URI=http://127.0.0.1:36478
```
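
The JWT_SECRET entry asks for a unique random string. One way to produce it is Python's standard `secrets` module. The snippet below is a sketch; the helper name `generate_jwt_secret` and the 64-byte length are choices made for this example, not project requirements.

```python
import secrets


def generate_jwt_secret(num_bytes=64):
    """Return a URL-safe random string built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed line into the backend .env file.
    print(f"JWT_SECRET={generate_jwt_secret()}")
```

Generate the value once and keep it out of version control, since anyone holding it could sign tokens that the backend would accept.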

Apply Database Migrations:
```
alembic upgrade head
```

Launch the Backend Server:

```
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```
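
To confirm the server is up before starting the desktop app, a plain TCP check against the port used above is enough. This sketch only tests that something accepts connections on 127.0.0.1:8000. It does not call any NeuralAgent endpoint, because none are documented here, and the function name `backend_is_listening` is made up for this example.

```python
import socket


def backend_is_listening(host="127.0.0.1", port=8000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    state = "reachable" if backend_is_listening() else "not reachable"
    print(f"Backend on 127.0.0.1:8000 is {state}")
```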

Frontend and Desktop Application Setup

Install Electron dependencies:
```
cd desktop
npm install
```
Install the React application's dependencies:
```
cd neuralagent-app
npm install
```

Set Frontend Environment Variables: Copy .env.example to .env within the neuralagent-app directory:

```
REACT_APP_PROTOCOL=http
REACT_APP_WEBSOCKET_PROTOCOL=ws
REACT_APP_DNS=127.0.0.1:8000
REACT_APP_API_KEY=
```

Return to the desktop root directory:
```
cd ..
```

Initialize the AI Agent Service (Python):

```
cd aiagent
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install -r requirements.txt
deactivate
```

Start the Desktop Application:
```
cd ..
npm start
```

Specialized Agents and Providers

NeuralAgent allows you to delegate specific tasks to different LLM providers by modifying the .env file. You can mix and match providers like OpenAI, Anthropic, or local Ollama instances based on your needs for speed or privacy.
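
A common mistake when mixing providers is leaving a `*_AGENT_MODEL_TYPE` entry set to the full `openai|azure_openai|...` placeholder from .env.example. The sketch below checks for that. It is not part of NeuralAgent: the function name `check_agent_config` is hypothetical, and the provider and agent names are simply the ones that appear in the .env template.

```python
import os

# Provider names and agent prefixes as they appear in .env.example.
PROVIDERS = {"openai", "azure_openai", "anthropic", "bedrock", "ollama", "gemini"}
AGENTS = ["CLASSIFIER", "TITLE", "SUGGESTOR", "PLANNER", "COMPUTER_USE"]


def check_agent_config(env):
    """Return a list of human-readable problems found in the agent settings."""
    problems = []
    for agent in AGENTS:
        type_key = f"{agent}_AGENT_MODEL_TYPE"
        id_key = f"{agent}_AGENT_MODEL_ID"
        model_type = env.get(type_key, "").strip()
        model_id = env.get(id_key, "").strip()
        if model_type not in PROVIDERS:
            problems.append(f"{type_key}={model_type!r} is not a single provider name")
        if not model_id:
            problems.append(f"{id_key} is empty")
    return problems


if __name__ == "__main__":
    # Assumes the .env values were already exported into the process environment.
    for problem in check_agent_config(os.environ):
        print(problem)
```

The function accepts any mapping, so it can also be run against a dictionary parsed from the .env file rather than `os.environ`.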

Available Agent Types:

Planner Agent: Formulates the steps required to complete a task.
Classifier Agent: Categorizes the type of intent or data processed.
Title Agent: Generates descriptive titles for active sessions.
Suggestor Agent: Recommends follow-up actions or improvements.
Computer Use Agent: The core engine that handles direct mouse and keyboard interaction.
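
To make the Computer Use Agent's role more concrete, here is a rough sketch of how a list of planned actions could be turned into pyautogui calls. The action dictionaries and the `run_actions` helper are invented for this illustration and do not reflect NeuralAgent's actual internal format. The backend is passed in as a parameter so the example can run against a recording stand-in instead of moving the real mouse.

```python
def run_actions(actions, backend):
    """Execute hypothetical action dicts against a pyautogui-compatible backend."""
    for action in actions:
        kind = action["type"]
        if kind == "click":
            backend.click(action["x"], action["y"])
        elif kind == "type":
            backend.write(action["text"], interval=0.02)
        elif kind == "press":
            backend.press(action["key"])
        elif kind == "hotkey":
            backend.hotkey(*action["keys"])
        else:
            raise ValueError(f"unsupported action type: {kind!r}")


class RecordingBackend:
    """Stand-in that records calls instead of controlling the desktop."""

    def __init__(self):
        self.calls = []

    def click(self, x, y):
        self.calls.append(("click", x, y))

    def write(self, text, interval=0.0):
        self.calls.append(("write", text))

    def press(self, key):
        self.calls.append(("press", key))

    def hotkey(self, *keys):
        self.calls.append(("hotkey", keys))


# Hypothetical plan: focus the address bar, type a URL, confirm, then click.
plan = [
    {"type": "hotkey", "keys": ["ctrl", "l"]},
    {"type": "type", "text": "www.getneuralagent.com"},
    {"type": "press", "key": "enter"},
    {"type": "click", "x": 640, "y": 360},
]

backend = RecordingBackend()
run_actions(plan, backend)
print(backend.calls)
```

Passing the imported `pyautogui` module as `backend` would send the same calls to the real desktop, since `click`, `write`, `press`, and `hotkey` are all pyautogui functions. Try that only in a session where stray clicks and keystrokes are harmless.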