Ops Orchestrator Agent
Ops Orchestrator Multi-Agent System
A comprehensive multi-agent system built on AWS Bedrock AgentCore that provides automated incident triaging, ChatOps collaboration, and report generation for operational workflows.
Quick Start Guide
Prerequisites
First, ensure you have completed all prerequisite setup as outlined in the Prerequisites section below.
Local Deployment Steps
- Configure and Launch the Agent Runtime
python ops_orchestrator_runtime.py --configure --launch
This will:
- Configure the AgentCore runtime environment
- Set up authentication and gateway connections
- Launch the ops orchestrator agent with runtime capabilities
- Navigate to Parent Directory and Invoke Agent
cd ..
python invoke_agent.py
- Agent Invocation Options
You can invoke the agent in either of two ways:
Option A: HTTP/REST API
- Use standard HTTP requests to interact with the agent
- The agent will be accessible via the configured gateway endpoint
Option B: AWS SDK (boto3)
- Use the AWS Bedrock AgentCore SDK for direct agent invocation
- Requires your agent ARN for programmatic access
- Example:
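import json
import boto3

# Data-plane client for invoking a deployed AgentCore runtime
client = boto3.client('bedrock-agentcore')
response = client.invoke_agent_runtime(
    agentRuntimeArn='your-agent-arn',
    # The payload shape depends on your agent's entrypoint; the "prompt" key is an assumption
    payload=json.dumps({'prompt': 'Your incident description here'}).encode('utf-8'),
)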
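For Option A, the request can be sketched as follows, assuming the gateway endpoint speaks MCP's JSON-RPC 2.0 over HTTP and accepts the OAuth2 access token as a bearer token. `build_gateway_request` is a hypothetical helper, not part of the project, and the URL and token are placeholders for the `mcp_url` and `access_token` values from your own setup.

```python
import json
import urllib.request


def build_gateway_request(mcp_url, access_token, method="tools/list"):
    """Build (but do not send) a JSON-RPC 2.0 POST request for an MCP gateway."""
    body = {"jsonrpc": "2.0", "id": 1, "method": method}
    return urllib.request.Request(
        mcp_url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
            # Assumption: the gateway expects the OAuth2 token as a bearer token
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


# Placeholder values; nothing is sent over the network here.
request = build_gateway_request("https://example.invalid/mcp", "your-access-token")
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen(request, timeout=30)` and decode the JSON response body.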
Summary of Deployment Process
- ✅ Complete prerequisites setup
- ✅ Run python ops_orchestrator_runtime.py --configure --launch
- ✅ Navigate up one directory: cd ..
- ✅ Invoke agent: python invoke_agent.py
- ✅ Use HTTP or boto3 for agent communication
Architecture Overview
The Ops Orchestrator Agent is a multi-agent system made up of three specialized agents that work together:
- Lead Agent (Issue Triaging) - Automated incident analysis and classification
- ChatOps Agent - Real-time collaboration through Teams, Slack, and Gmail
- Ticket Creator Agent - Automated ticket creation in JIRA and other systems
Each agent leverages AWS Bedrock AgentCore memory primitives and connects to external services through an MCP (Model Context Protocol) gateway with OAuth2 authentication.
Prerequisites
AWS Requirements
- AWS CLI configured with appropriate permissions
- Access to AWS Bedrock AgentCore services
- IAM permissions for:
bedrock:*
bedrock-agentcore:*
s3:*
lambda:*
iam:*
cognito-idp:*
secretsmanager:*
logs:*
cloudwatch:*
Python Dependencies
pip install boto3 pyyaml python-keycloak requests openai anthropic
External Service Authentication
You’ll need API credentials for the services you want to integrate:
- JIRA: Username and API token
- GitHub: Personal Access Token or OAuth app credentials
- Slack: Bot token (optional)
Environment Setup
Create a .env file or export the following environment variables:
# AWS Configuration
export AWS_REGION="us-east-1"
export AWS_ACCOUNT_ID="your-account-id"
# JIRA Integration (Required)
export JIRA_USERNAME="your-jira-username"
export JIRA_API_TOKEN="[REDACTED]"
export JIRA_DOMAIN="yourcompany.atlassian.net"
# GitHub Integration (Required)
export GITHUB_TOKEN="[REDACTED]"
# Optional: GitHub OAuth (for advanced features)
export GITHUB_CLIENT_ID="your-oauth-client-id"
export GITHUB_CLIENT_SECRET="[REDACTED]"
# Optional: JIRA OAuth (for advanced features)
export JIRA_CLIENT_ID="your-jira-oauth-client-id"
export JIRA_CLIENT_SECRET="[REDACTED]"
# Optional: Keycloak Authentication (Alternative to Cognito)
export KEYCLOAK_URL="http://localhost:8080/"
export KEYCLOAK_ADMIN_USER="admin"
export KEYCLOAK_ADMIN_PASS="[REDACTED]"
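Before launching, it can help to confirm that the variables above are actually set. The following is a minimal sketch; `missing_env_vars` is a hypothetical helper, and treating the two AWS variables as required alongside the JIRA and GitHub ones is an assumption.

```python
import os

# JIRA and GitHub variables are marked required above; including the AWS ones is an assumption.
REQUIRED_VARS = [
    "AWS_REGION",
    "AWS_ACCOUNT_ID",
    "JIRA_USERNAME",
    "JIRA_API_TOKEN",
    "JIRA_DOMAIN",
    "GITHUB_TOKEN",
]


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```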
Configuration Setup
The system uses a config.yaml file for configuration. Here’s the essential structure:
general:
  name: "ops-orchestrator-agent"
  description: "Multi-agent system for operations orchestration"
agent_information:
  ops_orchestrator_agent_model_info:
    model_id: gpt-4o-2024-08-06
    inference_parameters:
      temperature: 0.1
      max_tokens: 2048
# Memory configuration for each agent
memories:
  lead_agent:
    use_existing: false # Set to true if you have existing memory
    memory_id: null # Fill if reusing existing memory
  chat_ops_agent:
    use_existing: false
    memory_id: null
  ticket_agent:
    use_existing: false
    memory_id: null
# Gateway configuration
gateway_config:
  name: "ops-gw"
  # Authentication method (choose one)
  inbound_auth:
    type: "cognito" # or "keycloak"
    cognito:
      create_user_pool: true
      user_pool_name: "agentcore-gateway-ops"
      resource_server_id: "ops_orchestrator_agent"
      resource_server_name: "agentcore-gateway-ops"
      client_name: "agentcore-client-ops"
      scopes:
        - ScopeName: "gateway:read"
          ScopeDescription: "Read access"
        - ScopeName: "gateway:write"
          ScopeDescription: "Write access"
  credentials:
    use_cognito: true
    use_existing: false
    create_new_access_token: false
    gateway_id: null
    mcp_url: null
    access_token: null
  # S3 bucket for storing API specifications
  bucket_name: "ops-orchestrator-gateway-bucket"
  # Target integrations
  targets:
    - name: "jira-integration"
      spec_file: /absolute/path/to/tools/jira_api_spec.yaml
      type: "openapi"
      api_type: "jira"
      endpoint: "https://your-jira-instance.atlassian.net"
      authentication:
        type: "basic"
        credentials:
          username: "${JIRA_USERNAME}"
          password: "${JIRA_API_TOKEN}"
    - name: "github-integration"
      spec_file: /absolute/path/to/tools/github_api_spec.yaml
      type: "openapi"
      api_type: "github"
      endpoint: "https://api.github.com"
      authentication:
        type: "bearer"
        credentials:
          token: "${GITHUB_TOKEN}"
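The `${JIRA_USERNAME}`-style values in the targets are placeholders for environment variables. The README does not say how they are resolved, so the following is only a sketch of one plausible approach: substituting them after the file has been parsed (for example with `yaml.safe_load`). `expand_placeholders` is a hypothetical helper.

```python
import re

# Matches ${NAME} where NAME is a typical environment variable name
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value, environ):
    """Recursively replace ${VAR} in strings; raises KeyError if VAR is not in `environ`."""
    if isinstance(value, str):
        return _PLACEHOLDER.sub(lambda match: environ[match.group(1)], value)
    if isinstance(value, dict):
        return {key: expand_placeholders(item, environ) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_placeholders(item, environ) for item in value]
    return value  # numbers, booleans, and None pass through unchanged


# Example with an already-parsed fragment of the GitHub target
target = {"authentication": {"type": "bearer", "credentials": {"token": "${GITHUB_TOKEN}"}}}
resolved = expand_placeholders(target, {"GITHUB_TOKEN": "example-token"})
print(resolved["authentication"]["credentials"]["token"])
```

Failing with a `KeyError` on an unset variable is a deliberate choice here: a missing credential surfaces at load time rather than as an authentication failure later. In real use, pass `os.environ` as the second argument.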
Authentication Options
Option 1: AWS Cognito (Default)
The system automatically creates:
- Cognito User Pool for authentication
- Resource server with custom scopes
- Machine-to-machine client for API access
- Access tokens for gateway authentication
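The access token for a machine-to-machine client comes from Cognito's OAuth2 token endpoint using the client-credentials grant. The following sketch builds that request without sending it. It assumes the user pool has a hosted domain (`domain_prefix` is a placeholder) and that scopes are written as `resource_server_id/scope_name`, which is how Cognito names custom scopes; `build_cognito_token_request` is a hypothetical helper.

```python
import base64
import urllib.parse
import urllib.request


def build_cognito_token_request(domain_prefix, region, client_id, client_secret, scopes):
    """Build (but do not send) an OAuth2 client-credentials request for Cognito."""
    url = f"https://{domain_prefix}.auth.{region}.amazoncognito.com/oauth2/token"
    # Client ID and secret are sent as HTTP Basic credentials
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    form = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": " ".join(scopes)}
    )
    return urllib.request.Request(
        url,
        data=form.encode("ascii"),
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Authorization": f"Basic {basic}",
        },
        method="POST",
    )


# Placeholder values; the scope uses resource_server_id from config.yaml
token_request = build_cognito_token_request(
    "your-domain-prefix",
    "us-east-1",
    "your-client-id",
    "your-client-secret",
    ["ops_orchestrator_agent/gateway:read"],
)
print(token_request.full_url)
```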
Option 2: Keycloak Authentication
For organizations using Keycloak for identity management:
- Start Keycloak (if running locally):
docker run -p 8080:8080 \
  -e KEYCLOAK_ADMIN=admin \
  -e KEYCLOAK_ADMIN_PASSWORD=[REDACTED] \
  quay.io/keycloak/keycloak:latest start-dev
- Update config.yaml:
inbound_auth:
  type: "keycloak"
  keycloak:
    url: "${KEYCLOAK_URL}"
    admin_user: "${KEYCLOAK_ADMIN_USER}"
    admin_pass: "${KEYCLOAK_ADMIN_PASS}"
    realm_name: "ops-orchestrator-realm"
    client_id: "ops-orchestrator-gateway-client"
    create_realm: true
    scopes:
      - "gateway:read"
      - "gateway:write"
      - "ops:manage"
      - "incidents:create"
credentials:
  use_keycloak: true
Service Integrations
JIRA Integration
The system integrates with JIRA for automated ticket creation and management.
Required Setup:
- Create a JIRA API token in your Atlassian account
- Set environment variables:
export JIRA_USERNAME="your-email@company.com"
export JIRA_API_TOKEN="[REDACTED]"
export JIRA_DOMAIN="yourcompany.atlassian.net"
GitHub Integration
Integrates with GitHub for repository management and issue tracking.
Required Setup:
- Create a GitHub Personal Access Token with appropriate scopes
- Set environment variable:
export GITHUB_TOKEN="[REDACTED]"
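The two integrations use different schemes: basic authentication for JIRA (username plus API token) and a bearer token for GitHub, matching the `authentication` entries in config.yaml. The following sketch shows how the corresponding HTTP headers are formed. Both function names are hypothetical, and the GitHub `Accept` value is the media type GitHub recommends for its REST API rather than something this project specifies.

```python
import base64


def jira_basic_auth_header(username, api_token):
    """Return the Authorization header for JIRA basic auth (username:api_token, base64)."""
    raw = f"{username}:{api_token}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def github_bearer_header(token):
    """Return headers for a GitHub REST API call authenticated with a token."""
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }


# Placeholder credentials; in practice read these from the environment variables above
print(sorted(jira_basic_auth_header("your-email@company.com", "your-api-token")))
print(sorted(github_bearer_header("your-github-token")))
```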
Agent Memory System
Each agent uses AWS Bedrock AgentCore memory with different strategies:
Lead Agent Memory
- User Preferences: Stores user-specific incident handling preferences
- Semantic Memory: Contextual understanding of technical issues
- Summary Memory: Session-based conversation summaries
- Custom Issue Triaging: Specialized memory for incident classification
ChatOps Agent Memory
- User Preferences: Communication preferences and channels
- Semantic Memory: Chat context and collaboration patterns
- Summary Memory: Chat session summaries
- ChatOps Memory: Communication templates and escalation procedures
Ticket Creator Agent Memory
- User Preferences: Ticket creation preferences and templates
- Semantic Memory: Ticket patterns and classifications
- Summary Memory: Ticket creation session history
- Ticket Creator Memory: Template management and field mapping
Memory Configuration
memories:
  lead_agent:
    use_existing: false # Set to true to reuse existing memory
    memory_id: null # Memory ID if reusing
  chat_ops_agent:
    use_existing: false
    memory_id: null
  ticket_agent:
    use_existing: false
    memory_id: null
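The `use_existing` and `memory_id` fields together decide whether each agent gets a new memory or reuses one. The following sketch shows that decision as a pure function over the parsed `memories` mapping. `plan_memory_actions` is a hypothetical helper, and rejecting `use_existing: true` with an empty `memory_id` is an assumption about sensible behavior, not documented behavior of the project.

```python
def plan_memory_actions(memories):
    """Map each agent name to ("reuse", memory_id) or ("create", None)."""
    plan = {}
    for agent, settings in memories.items():
        if settings.get("use_existing"):
            memory_id = settings.get("memory_id")
            if not memory_id:
                # Assumption: reusing a memory without an ID is a configuration error
                raise ValueError(f"{agent}: use_existing is true but memory_id is empty")
            plan[agent] = ("reuse", memory_id)
        else:
            plan[agent] = ("create", None)
    return plan


# Example: reuse the lead agent's memory, create the other two ("mem-123" is a placeholder)
example = {
    "lead_agent": {"use_existing": True, "memory_id": "mem-123"},
    "chat_ops_agent": {"use_existing": False, "memory_id": None},
    "ticket_agent": {"use_existing": False, "memory_id": None},
}
print(plan_memory_actions(example))
```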
Troubleshooting
Common Issues
Memory Creation Fails
❌ Error creating memory for agent: AccessDenied
Solution: Check AWS permissions for bedrock-agentcore:*
Gateway Creation Fails
❌ Error creating gateway: ValidationException
Solutions:
- Verify AWS region supports Bedrock AgentCore
- Check IAM role permissions
- Validate authentication configuration
JIRA/GitHub Integration Fails
❌ Target creation failed: Authentication failed
Solutions:
- Verify API credentials are correct
- Check environment variables are exported
- Validate API endpoint URLs
- Ensure API tokens have required permissions
Debug Mode
Enable detailed logging:
export PYTHONPATH=.
python -c "
import logging
logging.basicConfig(level=logging.DEBUG)
exec(open('ops_orchestrator_multi_agent.py').read())
"
Security Best Practices
Credential Management
- Use environment variables, not hardcoded credentials
- Rotate API tokens regularly
- Use IAM roles with minimal required permissions
Network Security
- Use HTTPS for all external API calls
- Implement VPC endpoints for AWS services
- Consider private subnets for production deployments
Access Control
- Implement least-privilege IAM policies
- Use OAuth2 scopes to limit API access
- Regularly audit service integrations
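As an illustration of least privilege, a caller that only needs to invoke the deployed agent does not need the wildcard permissions listed under Prerequisites. The following sketch builds such a policy document. It assumes the `bedrock-agentcore:InvokeAgentRuntime` action and a runtime ARN of the form shown in the code; confirm both against the ARN of your own runtime before using it. `invoke_only_policy` is a hypothetical helper.

```python
import json


def invoke_only_policy(region, account_id, runtime_id):
    """Return an IAM policy document that allows invoking a single agent runtime."""
    # Assumption: runtime ARNs look like arn:aws:bedrock-agentcore:<region>:<account>:runtime/<id>;
    # the trailing * is meant to cover sub-resources of that runtime.
    resource = f"arn:aws:bedrock-agentcore:{region}:{account_id}:runtime/{runtime_id}*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["bedrock-agentcore:InvokeAgentRuntime"],
                "Resource": [resource],
            }
        ],
    }


# Placeholder account and runtime identifiers
print(json.dumps(invoke_only_policy("us-east-1", "123456789012", "your-runtime-id"), indent=2))
```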
Success Indicators
When properly configured, you should see:
✅ Observability initialized
✅ Created memory for lead_agent: OpsAgent_mem_xxx
✅ Created memory for chat_ops_agent: OpsAgent_chat_xxx
✅ Created memory for ticket_agent: TicketCreation_chat_xxx
✅ Gateway setup completed with URL: https://xxxxx
✅ Created 2 targets successfully
🚀 Ops orchestrator multi-agent system ready!
Your multi-agent system is now ready to handle operational incidents, create tickets, and collaborate across your organization! 🚀