Skip to main content

On This Page

Introduction to BaseX XML Database and Its Features

4 min read
Share

These articles are AI-generated summaries. Please check the original sources for full details.

Introduction to BaseX XML Database and Its Features

BaseX is a lightweight, high-performance XML database designed for storing, querying, and modifying large volumes of XML data using XQuery and XPath. It supports multiple interfaces, including a command-line interface (CLI), GUI, and HTTP APIs, enabling both local and remote access. This summary details its core features, setup, and usage patterns.

Key Features and Functionality

1. Overview of BaseX

  • Purpose: Store and manage XML data efficiently, with support for querying and manipulation.
  • Supported Standards: XQuery 3.1 and XPath 2.0.
  • Scalability: Designed to handle large datasets with high performance.
  • Interfaces: CLI, GUI, and HTTP APIs for flexibility in access and deployment.

2. Installation and Setup

  • Java Requirement: Requires Java 17 or newer JVM.
  • Installation Package: Download ZIP from the official site, which includes:
    • bin/: Scripts for starting the database.
    • data/: Directory for storing databases.
    • lib/ and webapp/: Core database software and web components.
    • src/ and repo/: XQuery scripts and module sources.
  • Multiple Databases: Supports managing multiple databases via separate directories in data/.

3. Command-Line Interface (CLI)

  • Use Cases: Administrative tasks and quick queries; not suitable for concurrent access.
  • Key Commands:
    • CREATE DATABASE <name> [URL]: Creates a new database, optionally loading data from a URL.
      • Example: CREATE DATABASE baeldung https://files.basex.org/xml/factbook.xml
    • LIST: Displays existing databases and their metadata.
    • OPEN <database>: Switches context to a specific database.
    • INFO DATABASE: Shows properties of the current database (size, nodes, timestamp, etc.).
    • ADD [TO <name>] <URL>: Adds XML files to the database.
    • DELETE <resource>: Removes a file from the database.
    • XQUERY <query>: Executes XQuery against the database.
      • Example: XQUERY //item/name returns all <name> elements from all files.

4. HTTP Interface

  • Functionality: Exposes databases via RESTful endpoints for remote access.
  • Starting the Server:
    • Run ./bin/basexhttp to start the HTTP server on ports 1984, 8080, and 8081.
    • Default admin credentials are generated in /data/.logs/ (e.g., 29a058d7-9dc3-4d7a-bab1-8380eca78e42).
  • Authentication: Uses HTTP Basic Auth with the default admin user.
  • REST Endpoints:
    • GET /rest: Lists all databases.
    • GET /rest/<database>: Lists resources in a database.
    • GET /rest/<database>/<resource>: Retrieves a specific XML file.
    • GET /rest/<database>?query=<XQuery>: Executes a query on the database.
    • POST /rest/<database>: Submits complex XQuery via POST.
    • PUT /rest/<database>/<resource>: Creates or updates a resource.
    • DELETE /rest/<database>/<resource>: Deletes a resource.

5. Performance and Metrics

  • Speed: Example operations like creating a database take 18.82 ms (empty) or 734.75 ms (with remote data).
  • Concurrency: CLI is not thread-safe; HTTP server handles concurrent access.
  • Data Size: Databases can scale to large XML files (e.g., 2251 kB in the example).

Working Example (CLI and HTTP)

CLI: Creating and Querying a Database

# Start BaseX CLI
./bin/basex

# Create a new database from a remote URL
CREATE DATABASE baeldung https://files.basex.org/xml/factbook.xml

# Open the database
OPEN baeldung

# Query for all <name> elements
XQUERY //item/name

HTTP: Querying via REST API

Request:

GET /rest/baeldung?query=//item/name HTTP/1.1
Authorization: Basic YWRtaW46TmV3UGFzc3dvcmQ=

Response:

<name>duteous nine eighteen </name>
<name>great </name>
...
<name>holds perhaps despair amorous </name>

Recommendations

  • CLI Best Practices:

    • Use for administrative tasks or ad-hoc queries, not for concurrent access.
    • Always verify the database context with INFO DATABASE after switching.
    • Secure the admin password using PASSWORD <new_password>.
  • HTTP Interface Best Practices:

    • Use HTTPS in production to encrypt data.
    • Regularly update the admin password and restrict access to trusted IPs.
    • For complex queries, use POST with XML-encoded XQuery (e.g., for $i in (//city/name)[position() <= 5] return string-length($i)).
  • When to Use BaseX:

    • Applications requiring high-performance XML storage and querying.
    • Scenarios with large, unstructured XML datasets (e.g., scientific data, document management).
  • Pitfalls to Avoid:

    • Do not rely on CLI for concurrent operations; use HTTP APIs instead.
    • Avoid hardcoding admin credentials in scripts; store them securely.
    • Ensure Java 17+ is installed to avoid runtime errors.

Reference: Baeldung - Introduction to BaseX

Continue reading

Next article

Securing Google Workspace: Best Practices for Lean Security Teams

Related Content