# Troubleshooting

Diagnosing service failures, performance bottlenecks, and log analysis.
When issues arise, follow a systematic approach to isolate and resolve the root cause.
## 1. General Health Check
Always start with these baseline commands to rule out resource exhaustion:
```bash
gadmin status    # Ensure all services are UP
df -lh           # Check for disk space exhaustion
free -g          # Check for memory pressure
dmesg -T | tail  # Look for OOM (Out of Memory) kills
```
## 2. Navigating Log Files
Each service writes detailed logs to the directory defined by `System.LogRoot`.
| Service | Primary Log File | Use Case |
|---|---|---|
| GPE | gpe/log.INFO | Query execution, memory usage, graph errors. |
| RESTPP | restpp/log.INFO | REST API requests, input validation, loading jobs. |
| GSQL | gsql_server_log/GSQL_LOG | Query compilation and installation errors. |
| NGINX | nginx/nginx.access.log | Connectivity and authentication issues. |
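The primary logs in the table can be scanned for recent errors in one pass. A minimal sketch, assuming the service subdirectory layout shown above; the `scan_logs` helper name and the default error patterns are ours, and the real log root should be resolved with `gadmin config get System.LogRoot`:

```bash
# scan_logs: grep each service's primary log (paths from the table above)
# for recent error lines. Helper name and error patterns are illustrative.
scan_logs() {
  local root="$1"
  local f
  for f in gpe/log.INFO restpp/log.INFO gsql_server_log/GSQL_LOG; do
    [ -f "$root/$f" ] || continue
    echo "== $f =="
    grep -iE "error|fatal" "$root/$f" | tail -n 5
  done
}

# On a live cluster, resolve the real log root first:
#   scan_logs "$(gadmin config get System.LogRoot)"
```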
## 3. Common Issues

### Slow Query Performance
- **Huge JSON response:** If GPE CPU stays high after query execution has finished, the engine is likely serializing an oversized result; reduce the amount of data the query returns.
- **Insufficient memory:** Check whether the system is swapping to disk.
- **Logic bottlenecks:** Verify that your GSQL query is not traversing more edges than necessary.
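The swapping check can be scripted. A minimal sketch for Linux that reads `/proc/meminfo` directly; the `mem_check` helper name and the 10% warning threshold are our assumptions, not TigerGraph defaults:

```bash
# mem_check: report available memory as a percentage of total and warn
# when the node is close to swapping. Accepts an alternate meminfo-format
# file as an argument (useful for testing); defaults to /proc/meminfo.
mem_check() {
  awk '/^MemTotal:/     {total=$2}
       /^MemAvailable:/ {avail=$2}
       END {
         pct = 100 * avail / total
         printf "available: %.0f%%\n", pct
         if (pct < 10) print "WARNING: memory pressure -- expect swapping"
       }' "${1:-/proc/meminfo}"
}

mem_check   # e.g. "available: 42%"
```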
### Services "Not Ready"

If a service such as GSE is stuck in `not_ready`, it is usually "warming up" (loading data from disk into RAM). Check CPU usage to confirm that it is making progress.
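One quick way to confirm warm-up activity is to look at the busiest processes on the node. A sketch; the exact daemon names vary by TigerGraph version, so match them loosely:

```bash
# Show the ten busiest processes. A GSE/GPE daemon near the top while the
# service reports not_ready usually means data is still loading into RAM.
ps -eo pid,comm,%cpu --sort=-%cpu | head -n 11
```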
### Cluster Out of Sync
TigerGraph requires clocks across all nodes to be synchronized within 2 seconds. If they drift, schema changes and loading jobs will fail.
```bash
grun all "date"   # Compare time across nodes
```
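The output of `grun all "date"` still has to be compared by eye; a small helper makes the 2-second limit explicit. A sketch (`check_drift` is an illustrative name, and the ssh usage assumes passwordless access between nodes):

```bash
# check_drift LOCAL_EPOCH REMOTE_EPOCH -> "OK" or "DRIFT <n>s"
# Flags any absolute difference above the 2-second limit.
check_drift() {
  local d=$(( $2 - $1 ))
  [ "$d" -lt 0 ] && d=$(( -d ))
  if [ "$d" -gt 2 ]; then echo "DRIFT ${d}s"; else echo "OK"; fi
}

# On a live cluster (hostname is a placeholder):
#   check_drift "$(date +%s)" "$(ssh node2 date +%s)"
check_drift 1700000000 1700000001   # → OK
```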
## 4. Collecting Support Data
If you need to contact TigerGraph support, use the `gcollect` tool to gather all relevant logs and configs into a single bundle:

```bash
gcollect collect
```
> [!IMPORTANT]
> **Query Abortion:** If queries are being aborted unexpectedly, check the GPE log for the message `System Memory in Critical state`. This indicates the system is protecting itself from an OOM crash.