Cassandra

Cassandra Endpoint Documentation

Overview

Cassandra endpoints in Eden provide a standardized interface for interacting with Apache Cassandra and compatible databases like DataStax Enterprise and ScyllaDB. These endpoints support CQL (Cassandra Query Language) operations with specialized handling for Cassandra's distributed architecture and unique consistency model.

Key Features

  • Distributed Query Support: Optimized for Cassandra's distributed architecture

  • Consistency Level Control: Fine-grained control over read/write consistency

  • Automated Pagination: Single-page queries driven by paging tokens, or automatic retrieval of all pages in one request

  • Batch Operations: Support for atomic multi-statement operations

  • Performance Optimization: Connection pooling and query preparation

API Request Types

Cassandra endpoints support three primary operation types:

1. Batch Operations

Executes multiple CQL statements as a single atomic batch: either all of them are applied or none are.

Required Properties:

  • kind: Must be "cassandra"

  • type: Must be "batch"

  • queries: An array of CQL statement strings to be executed in the batch

Example Request:

{
  "kind": "cassandra",
  "type": "batch",
  "queries": [
    "INSERT INTO users (id, name, email) VALUES (uuid(), 'John Doe', 'john@example.com')",
    "UPDATE user_stats SET last_login = toTimestamp(now()) WHERE user_id = '123e4567-e89b-12d3-a456-426614174000'"
  ]
}

Response:

{
  "success": true,
  "batchApplied": true,
  "metadata": {
    "executionTime": 12,
    "hostUsed": "cassandra-node-3.example.com",
    "consistencyAchieved": "LOCAL_QUORUM"
  }
}

Use Cases:

  • Atomic multi-table updates

  • Transaction-like behavior for related operations

  • Bulk updates where all-or-nothing semantics are required

Limitations:

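  • Most efficient when every statement targets the same partition; multi-partition batches add coordination overhead

  • Limited to reasonable batch sizes (under 50 statements)

  • Counter updates (for example, SET login_count = login_count + 1) cannot be combined with non-counter statements in the same batch

  • Not a substitute for true ACID transactions: a multi-partition batch is atomic but not isolated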
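A minimal sketch of assembling a batch request body on the client side, assuming the request shape shown in the example above. The helper name build_batch_request and the MAX_BATCH_STATEMENTS constant are hypothetical and not part of Eden; the limit of 50 mirrors the guideline in this section rather than a documented API rule.

```python
import json

# Hypothetical client-side guard; mirrors the "under 50 statements" guideline.
MAX_BATCH_STATEMENTS = 50


def build_batch_request(queries):
    """Return a batch request body as a dict after validating the statement list."""
    # Drop empty or whitespace-only entries so they never reach the endpoint.
    statements = [q.strip() for q in queries if q and q.strip()]
    if not statements:
        raise ValueError("a batch needs at least one CQL statement")
    if len(statements) >= MAX_BATCH_STATEMENTS:
        raise ValueError(
            f"batch has {len(statements)} statements; "
            f"keep it under {MAX_BATCH_STATEMENTS}"
        )
    return {"kind": "cassandra", "type": "batch", "queries": statements}


if __name__ == "__main__":
    body = build_batch_request([
        "INSERT INTO users (id, name, email) "
        "VALUES (uuid(), 'John Doe', 'john@example.com')",
    ])
    print(json.dumps(body, indent=2))
```

Rejecting oversized batches before sending keeps the failure close to the code that built the statement list.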

2. Single Page Query

For executing a CQL query that returns a single page of results, with support for pagination controls.

Required Properties:

  • kind: Must be "cassandra"

  • type: Must be "query_single_page"

  • query: A CQL query string

Optional Properties:

  • parameters: Array of parameter values to bind to placeholders

  • pageSize: Number of rows to return in a single page

  • pagingState: Opaque token taken from the metadata of a previous response; pass it back unchanged to fetch the next page

  • consistencyLevel: Desired consistency level for the query

Example Request:

{
  "kind": "cassandra",
  "type": "query_single_page",
  "query": "SELECT * FROM products WHERE category = ? LIMIT 100",
  "parameters": ["electronics"],
  "pageSize": 25,
  "consistencyLevel": "LOCAL_ONE"
}

Response:

{
  "data": [
    {
      "id": "prod-123",
      "name": "Wireless Headphones",
      "category": "electronics",
      "price": 89.99,
      "inventory": 42
    },
    // ...more rows
  ],
  "metadata": {
    "executionTime": 18,
    "rowCount": 25,
    "hasMorePages": true,
    "pagingState": "AENvbHVtbgAAC2NyZWF0ZWRfdGltZQNQYXJ0aXRpb25LZXkAAA=="
  }
}

Use Cases:

  • Reading data with a known limit

  • UI-driven paged views of data

  • When you need to process one page of results at a time
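The round trip for a paged view can be sketched as follows, assuming the response carries hasMorePages and pagingState under metadata as in the example above. The function name next_page_request is hypothetical; how the request is actually sent to Eden is left out.

```python
def next_page_request(request, response):
    """Return the request for the following page, or None when no page is left."""
    metadata = response.get("metadata", {})
    if not metadata.get("hasMorePages"):
        return None
    token = metadata.get("pagingState")
    if not token:
        return None
    # Shallow copy so the caller's original request is left untouched.
    follow_up = dict(request)
    follow_up["pagingState"] = token
    return follow_up


if __name__ == "__main__":
    first = {
        "kind": "cassandra",
        "type": "query_single_page",
        "query": "SELECT * FROM products WHERE category = ?",
        "parameters": ["electronics"],
        "pageSize": 25,
    }
    sample_response = {
        "data": [],
        "metadata": {"hasMorePages": True, "pagingState": "example-token"},
    }
    print(next_page_request(first, sample_response))
```

A UI would typically store the token alongside the current page and call this helper when the user requests the next page.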

3. Unpaged Query

For executing a CQL query that automatically handles pagination internally and returns all matching results.

Required Properties:

  • kind: Must be "cassandra"

  • type: Must be "query_unpage"

  • query: A CQL query string

Optional Properties:

  • parameters: Array of parameter values to bind to placeholders

  • maxRows: Maximum number of rows to return (safety limit)

  • consistencyLevel: Desired consistency level for the query

Example Request:

{
  "kind": "cassandra",
  "type": "query_unpage",
  "query": "SELECT * FROM orders WHERE status = ?",
  "parameters": ["pending"],
  "maxRows": 10000,
  "consistencyLevel": "QUORUM"
}

Response:

{
  "data": [
    {
      "order_id": "ord-456",
      "customer_id": "cust-789",
      "status": "pending",
      "total": 129.99,
      "created_at": "2025-05-10T14:30:45Z"
    },
    // ...all matching rows
  ],
  "metadata": {
    "executionTime": 145,
    "rowCount": 327,
    "pagesRetrieved": 4
  }
}

Use Cases:

  • Retrieving complete result sets that might span multiple pages

  • Exporting data for processing

  • Operations where you need all matching records regardless of quantity

Caution:

  • Use with care on potentially large result sets

  • Always set a reasonable maxRows value to prevent memory issues

  • Consider using query_single_page for very large tables
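A small sketch of building an unpaged request that always carries a maxRows cap, following the cautions above. The helper names are hypothetical, and hit_row_cap rests on an assumption: that a rowCount equal to the cap means the result may have been cut off. The source does not say how truncation is reported.

```python
def build_unpaged_request(query, parameters=None, max_rows=10000,
                          consistency_level=None):
    """Build a query_unpage request body that always includes a row cap."""
    if not isinstance(max_rows, int) or max_rows <= 0:
        raise ValueError("max_rows must be a positive integer")
    request = {
        "kind": "cassandra",
        "type": "query_unpage",
        "query": query,
        "maxRows": max_rows,
    }
    if parameters:
        request["parameters"] = list(parameters)
    if consistency_level:
        request["consistencyLevel"] = consistency_level
    return request


def hit_row_cap(response, max_rows):
    """Assumption: rowCount reaching the cap suggests more rows may exist."""
    return response.get("metadata", {}).get("rowCount", 0) >= max_rows


if __name__ == "__main__":
    print(build_unpaged_request(
        "SELECT * FROM orders WHERE status = ?", ["pending"], 10000, "QUORUM"
    ))
```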

Endpoint Configuration

Basic Configuration Example

{
  "name": "ProductionCassandra",
  "type": "cassandra",
  "description": "Main product catalog database",
  "connection": {
    "contactPoints": ["cassandra-1.example.com", "cassandra-2.example.com"],
    "port": 9042,
    "keyspace": "product_catalog",
    "localDatacenter": "us-east"
  },
  "authentication": {
    "type": "username_password",
    "username": "${ENV_CASSANDRA_USER}",
    "password": "${ENV_CASSANDRA_PASSWORD}"
  },
  "settings": {
    "consistencyLevel": "LOCAL_QUORUM",
    "serialConsistencyLevel": "LOCAL_SERIAL",
    "requestTimeout": 10000
  }
}

Advanced Configuration Example

{
  "name": "AnalyticsCassandra",
  "type": "cassandra",
  "description": "Analytics data store with advanced settings",
  "connection": {
    "contactPoints": ["cass-an-01.example.com", "cass-an-02.example.com", "cass-an-03.example.com"],
    "port": 9042,
    "keyspace": "analytics",
    "localDatacenter": "us-west"
  },
  "authentication": {
    "type": "username_password",
    "username": "${ENV_CASSANDRA_USER}",
    "password": "${ENV_CASSANDRA_PASSWORD}"
  },
  "settings": {
    "consistencyLevel": "LOCAL_QUORUM",
    "serialConsistencyLevel": "LOCAL_SERIAL",
    "requestTimeout": 10000,
    "poolingOptions": {
      "coreConnectionsPerHost": 2,
      "maxConnectionsPerHost": 8,
      "maxRequestsPerConnection": 1024
    }
  },
  "advanced": {
    "queryOptions": {
      "prepareOnAllHosts": true,
      "fetchSize": 5000
    },
    "socketOptions": {
      "connectTimeoutMillis": 5000,
      "readTimeoutMillis": 12000
    },
    "retryPolicy": "DEFAULT",
    "speculativeExecutionPolicy": {
      "type": "PERCENTILE",
      "percentile": 99,
      "maxExecutions": 3
    },
    "metrics": {
      "enabled": true,
      "jmxReporting": false
    }
  }
}

Configuration Options Reference

Connection Options

| Option | Type | Description |
| --- | --- | --- |
| contactPoints | Array | List of Cassandra nodes to connect to |
| port | Number | Port to connect to (default: 9042) |
| keyspace | String | Default keyspace to use |
| localDatacenter | String | Name of the local datacenter |
| protocolVersion | Number | Cassandra protocol version to use |

Authentication Options

| Option | Type | Description |
| --- | --- | --- |
| type | String | Authentication type: "username_password", "kerberos", or "none" |
| username | String | Username for authentication |
| password | String | Password for authentication |
| servicePrincipal | String | Kerberos service principal (for Kerberos auth) |

Settings Options

| Option | Type | Description |
| --- | --- | --- |
| consistencyLevel | String | Default consistency level for queries |
| serialConsistencyLevel | String | Consistency level for conditional updates |
| requestTimeout | Number | Request timeout in milliseconds |
| poolingOptions | Object | Connection pool configuration |
| compression | String | Compression algorithm: "LZ4", "SNAPPY", or "NONE" |

Advanced Options

| Option | Type | Description |
| --- | --- | --- |
| queryOptions | Object | Query-specific options |
| socketOptions | Object | Socket connection options |
| retryPolicy | String/Object | Policy for retrying failed operations |
| speculativeExecutionPolicy | Object | Configuration for speculative executions |
| metrics | Object | Metrics collection configuration |
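As an illustration of how the option reference above can be used, the following sketch checks a configuration dict against a few of the documented values before it is submitted. The function validate_endpoint_config is hypothetical, covers only a subset of options, and assumes that a missing authentication type behaves like "none" and a missing compression setting like "NONE"; Eden's own validation may differ.

```python
ALLOWED_AUTH_TYPES = {"username_password", "kerberos", "none"}
ALLOWED_COMPRESSION = {"LZ4", "SNAPPY", "NONE"}


def validate_endpoint_config(config):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []

    connection = config.get("connection", {})
    contact_points = connection.get("contactPoints")
    if not isinstance(contact_points, list) or not contact_points:
        problems.append("connection.contactPoints must be a non-empty array")

    port = connection.get("port", 9042)  # documented default
    if not isinstance(port, int) or not 1 <= port <= 65535:
        problems.append("connection.port must be an integer from 1 to 65535")

    # Assumption: an absent authentication block is treated as "none".
    auth_type = config.get("authentication", {}).get("type", "none")
    if auth_type not in ALLOWED_AUTH_TYPES:
        problems.append(f"authentication.type {auth_type!r} is not documented")

    # Assumption: an absent compression setting is treated as "NONE".
    compression = config.get("settings", {}).get("compression", "NONE")
    if compression not in ALLOWED_COMPRESSION:
        problems.append(f"settings.compression {compression!r} is not documented")

    return problems


if __name__ == "__main__":
    print(validate_endpoint_config({
        "connection": {"contactPoints": ["cassandra-1.example.com"], "port": 9042},
        "authentication": {"type": "username_password"},
    }))
```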

Best Practices

Query Optimization

  1. Use Prepared Statements:

    • The driver automatically prepares statements, but ensure your queries are parameterized

    • Use bind markers (?) instead of string concatenation

  2. Partitioning Strategy:

    • Design tables with query patterns in mind

    • Include the partition key in WHERE clauses

    • Avoid large partitions that can cause hotspots

  3. Avoid Anti-Patterns:

    • Don't use ALLOW FILTERING in production queries

    • Avoid collection-based filtering without secondary indexes

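    • Use ORDER BY only on clustering columns; Cassandra rejects it on other columns, so build the required sort order into the table's clustering key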
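The point about bind markers can be sketched as a request builder that keeps values out of the query string. The helper build_query_request is hypothetical; it assumes the parameters array is bound positionally to the ? markers, as the request examples in this document suggest.

```python
def build_query_request(query, parameters, page_size=None):
    """Build a query_single_page request whose values travel as bound parameters."""
    # Naive check: a literal '?' inside a quoted CQL string would be miscounted.
    markers = query.count("?")
    if markers != len(parameters):
        raise ValueError(
            f"query has {markers} bind markers but {len(parameters)} parameters"
        )
    request = {
        "kind": "cassandra",
        "type": "query_single_page",
        "query": query,
        "parameters": list(parameters),
    }
    if page_size is not None:
        request["pageSize"] = page_size
    return request


if __name__ == "__main__":
    category = "electronics"
    # Avoid: "SELECT * FROM products WHERE category = '" + category + "'"
    print(build_query_request(
        "SELECT * FROM products WHERE category = ?", [category], page_size=25
    ))
```

Keeping the query text constant also lets a prepared statement be reused across calls with different values.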

Consistency Considerations

  1. Read Consistency:

    • Use LOCAL_ONE for low-latency, non-critical reads

    • Use LOCAL_QUORUM for consistent reads within a datacenter

    • Use QUORUM for cross-datacenter consistency

  2. Write Consistency:

    • Use LOCAL_QUORUM for most writes

    • Use QUORUM for critical writes that must be cross-datacenter consistent

    • Use ALL only when every replica must acknowledge the operation; a single unavailable replica then causes the request to fail
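The trade-offs above follow from simple replica arithmetic. A quorum is floor(RF / 2) + 1 replicas, and a read is guaranteed to see the latest acknowledged write when the replicas contacted for the read and for the write must overlap, that is, when R + W > RF. The sketch below works this out for a single datacenter; the function names are illustrative only.

```python
def quorum(replication_factor):
    """Number of replicas that form a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level, local_rf, total_rf):
    """Replicas that must respond for a given consistency level.

    local_rf is the replication factor in the local datacenter,
    total_rf the sum of replication factors across all datacenters.
    """
    required = {
        "ONE": 1,
        "LOCAL_ONE": 1,
        "LOCAL_QUORUM": quorum(local_rf),
        "QUORUM": quorum(total_rf),
        "ALL": total_rf,
    }
    return required[level]


def reads_see_latest_write(read_level, write_level, rf):
    """Single-datacenter check of the overlap rule R + W > RF."""
    r = replicas_required(read_level, rf, rf)
    w = replicas_required(write_level, rf, rf)
    return r + w > rf


if __name__ == "__main__":
    print(quorum(3))
    print(reads_see_latest_write("LOCAL_QUORUM", "LOCAL_QUORUM", 3))
    print(reads_see_latest_write("LOCAL_ONE", "LOCAL_QUORUM", 3))
```

With a replication factor of 3, quorum reads paired with quorum writes overlap (2 + 2 > 3), while LOCAL_ONE reads paired with LOCAL_QUORUM writes do not (1 + 2 = 3), which is why LOCAL_ONE suits only non-critical reads.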

Batch Usage Guidelines

  1. Use Cases:

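    • Prefer batches whose statements all target the same partition; these are the cheapest to execute

    • Use multi-partition or multi-table batches only when the updates must be atomic, and accept the extra coordination cost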

  2. Limitations:

    • Keep batches small (under 50 statements)

    • Avoid mixing conditional and non-conditional updates

    • Don't use batches just for performance (they can be slower)
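One way to follow these guidelines is to group statements by partition key before batching. The sketch below assumes the caller already knows each statement's partition key and supplies it alongside the CQL text; the function batches_by_partition is hypothetical. Note that when one partition's statements are split across several batches, all-or-nothing behavior no longer spans the whole group.

```python
from collections import defaultdict


def batches_by_partition(statements, max_statements=49):
    """Group (partition_key, cql) pairs into single-partition batch requests.

    Each request holds at most max_statements statements, keeping it
    under the 50-statement guideline.
    """
    grouped = defaultdict(list)
    for partition_key, cql in statements:
        grouped[partition_key].append(cql)

    requests = []
    for queries in grouped.values():
        # Split oversized groups into consecutive chunks.
        for start in range(0, len(queries), max_statements):
            requests.append({
                "kind": "cassandra",
                "type": "batch",
                "queries": queries[start:start + max_statements],
            })
    return requests


if __name__ == "__main__":
    work = [
        ("user-1", "UPDATE profiles SET name = 'A' WHERE user_id = 'user-1'"),
        ("user-2", "UPDATE profiles SET name = 'B' WHERE user_id = 'user-2'"),
        ("user-1", "UPDATE profiles SET email = 'a@example.com' WHERE user_id = 'user-1'"),
    ]
    for request in batches_by_partition(work):
        print(len(request["queries"]), "statement(s)")
```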

Pagination Strategies

  1. UI Pagination:

    • Use query_single_page with a consistent page size

    • Save the pagingState token between requests

    • Include a reasonable page size that matches your UI

  2. Data Processing:

    • Use query_unpage for datasets under 10,000 rows

    • For larger datasets, implement manual pagination with query_single_page

    • Always set a maximum row limit to prevent memory issues
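Manual pagination with a row cap can be sketched as a generator. The send argument is a hypothetical callable that submits a request body to the endpoint and returns the parsed response; the response fields are assumed to match the single-page example earlier in this document.

```python
def iter_rows(send, request, max_rows):
    """Yield rows page by page until pages run out or max_rows rows were yielded."""
    yielded = 0
    current = dict(request)
    while current is not None and yielded < max_rows:
        response = send(current)
        for row in response.get("data", []):
            if yielded >= max_rows:
                return
            yield row
            yielded += 1
        metadata = response.get("metadata", {})
        token = metadata.get("pagingState")
        if metadata.get("hasMorePages") and token:
            # Same query, next position in the result set.
            current = dict(current, pagingState=token)
        else:
            current = None


if __name__ == "__main__":
    # Stand-in for a real transport: two canned pages keyed by paging token.
    pages = {
        None: {"data": [{"n": 1}, {"n": 2}],
               "metadata": {"hasMorePages": True, "pagingState": "p2"}},
        "p2": {"data": [{"n": 3}], "metadata": {"hasMorePages": False}},
    }
    base = {"kind": "cassandra", "type": "query_single_page",
            "query": "SELECT * FROM orders", "pageSize": 2}
    for row in iter_rows(lambda req: pages[req.get("pagingState")], base, 10):
        print(row)
```

Because rows are yielded one page at a time, memory use is bounded by the page size rather than by the size of the full result set.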

Troubleshooting

Common Issues and Solutions

  1. Connection Failures:

    • Verify network connectivity to contact points

    • Check authentication credentials

    • Ensure the specified keyspace exists

    • Verify port settings (default is 9042)

  2. Timeout Errors:

    • Increase request timeout settings

    • Check for network latency issues

    • Verify Cassandra cluster health

    • Consider using a lower consistency level

  3. Consistency Errors:

    • Check the number of available replicas

    • Verify datacenter configuration

    • Consider using a lower consistency level temporarily

    • Check for node failures in the cluster
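For the connectivity checks listed above, a quick first step is to confirm that each contact point accepts TCP connections on the configured port. The sketch below uses only the Python standard library; check_contact_points is a hypothetical helper, and a successful TCP connection does not prove that authentication or the keyspace are correct.

```python
import socket


def check_contact_points(contact_points, port=9042, timeout=2.0):
    """Return {host: bool} indicating whether a TCP connection could be opened."""
    results = {}
    for host in contact_points:
        try:
            # create_connection resolves the host name and applies the timeout.
            with socket.create_connection((host, port), timeout=timeout):
                results[host] = True
        except OSError:
            # Covers refused connections, timeouts, and DNS failures.
            results[host] = False
    return results


if __name__ == "__main__":
    hosts = ["cassandra-1.example.com", "cassandra-2.example.com"]
    for host, reachable in check_contact_points(hosts).items():
        print(f"{host}: {'reachable' if reachable else 'unreachable'}")
```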

Monitoring Recommendations

  1. Connection Metrics:

    • Monitor active connections

    • Track connection error rates

    • Set alerts for connection pool exhaustion

  2. Query Performance:

    • Monitor average query latency

    • Track slow queries

    • Set up alerting for query timeouts

  3. Health Checks:

    • Implement periodic connectivity tests

    • Monitor Cassandra node status

    • Set up endpoint health dashboards
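A minimal sketch of client-side latency tracking, built on the executionTime field shown in the response examples. It assumes that value is in milliseconds, which the source does not state; the class name and the 100 ms threshold are hypothetical.

```python
import statistics


class QueryLatencyTracker:
    """Collects execution times from responses and flags slow queries."""

    def __init__(self, slow_threshold_ms=100):
        self.slow_threshold_ms = slow_threshold_ms
        self.samples = []
        self.slow_queries = []

    def record(self, query, response):
        """Store the execution time reported in the response metadata, if any."""
        elapsed = response.get("metadata", {}).get("executionTime")
        if elapsed is None:
            return
        self.samples.append(elapsed)
        if elapsed > self.slow_threshold_ms:
            self.slow_queries.append((query, elapsed))

    def average_ms(self):
        """Mean execution time, or None when nothing has been recorded."""
        return statistics.mean(self.samples) if self.samples else None


if __name__ == "__main__":
    tracker = QueryLatencyTracker(slow_threshold_ms=100)
    tracker.record("SELECT * FROM products WHERE category = ?",
                   {"metadata": {"executionTime": 18}})
    tracker.record("SELECT * FROM orders WHERE status = ?",
                   {"metadata": {"executionTime": 145}})
    print(tracker.average_ms(), tracker.slow_queries)
```

The recorded values could feed whatever alerting or dashboard system is already in place.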