Modules
Kotlin DataFrame is composed of modules, allowing you to include only the functionality you need. In addition, Kotlin DataFrame provides several plugins that make development more convenient and add compile-time safety.
Module | Function |
---|---|
dataframe | General artifact – combines all core and IO artifacts except experimental ones. |
dataframe-core | The DataFrame API and its implementation. |
dataframe-json | Provides support for reading and writing the JSON format. |
dataframe-csv | Provides support for reading and writing the CSV format. |
dataframe-excel | Provides support for reading and writing the XLS/XLSX formats. |
dataframe-jdbc | Provides support for reading from JDBC data sources. |
dataframe-arrow | Provides support for reading and writing the Apache Arrow format. |
dataframe-geo | Provides a new API for working with geospatial data and IO for geographic formats (GeoJSON, Shapefile). |
dataframe-openapi | Provides support for reading and writing the OpenAPI JSON format. |
dataframe-openapi-generator | Provides schema generation from OpenAPI specifications. Requires dataframe-openapi. |
kotlin.plugin.dataframe | Kotlin compiler plugin. Provides compile-time generation of extension properties. |
kotlinx.dataframe | Gradle plugin. Provides schema generation using Gradle. |
kotlinx.dataframe:symbol-processor-all | KSP plugin. Provides schema generation using KSP. |
Configure the repository
All Kotlin DataFrame modules are available from the Maven Central repository. To use them, add Maven Central to the repositories block of your build script.
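A minimal sketch of the repository declaration, assuming a Gradle project configured with the Kotlin DSL (build.gradle.kts):

```kotlin
// build.gradle.kts
repositories {
    // Kotlin DataFrame modules are published to Maven Central.
    mavenCentral()
}
```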
dataframe - general Kotlin DataFrame dependency
General-purpose artifact that includes all core and IO modules.
Does not include any experimental modules.
Recommended if you don’t need fine-grained control over individual module dependencies.
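As an illustration, the general artifact could be declared as follows in build.gradle.kts. The version value is a placeholder introduced for this sketch, not a recommendation; substitute the release you intend to use.

```kotlin
// build.gradle.kts
// Placeholder (assumption): replace with a real Kotlin DataFrame release.
val dataframeVersion = "x.y.z"

dependencies {
    // General artifact: core plus all non-experimental IO modules.
    implementation("org.jetbrains.kotlinx:dataframe:$dataframeVersion")
}
```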
Core Kotlin DataFrame modules
dataframe-core
The core DataFrame API and its implementation.
Includes all core functionality for working with data structures, expressions, schema management, and operations.
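If you prefer fine-grained control over dependencies, a sketch of depending on the core module alone might look like this. The version value is again a placeholder assumed for illustration.

```kotlin
// build.gradle.kts
// Placeholder (assumption): replace with a real Kotlin DataFrame release.
val dataframeVersion = "x.y.z"

dependencies {
    // Core API only; IO modules are added separately as needed.
    implementation("org.jetbrains.kotlinx:dataframe-core:$dataframeVersion")
}
```

Individual IO modules from the next section can then be added one by one alongside it.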
IO Kotlin DataFrame modules
dataframe-json
Provides support for reading and writing JSON data sources. Based on Kotlinx Serialization.
dataframe-csv
Provides support for reading and writing CSV files.
Supports standard CSV format features such as delimiters, headers, and quotes.
Based on high-performance Deephaven CSV.
Note that dataframe-json is included with dataframe-csv by default. This is to support JSON structures inside CSV files. If you don't need this functionality, you can exclude dataframe-json from the dataframe-csv dependency.
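A sketch of such an exclusion in the Gradle Kotlin DSL, using Gradle's standard per-dependency exclude mechanism. The version value is a placeholder assumed for illustration.

```kotlin
// build.gradle.kts
// Placeholder (assumption): replace with a real Kotlin DataFrame release.
val dataframeVersion = "x.y.z"

dependencies {
    implementation("org.jetbrains.kotlinx:dataframe-csv:$dataframeVersion") {
        // Drop the transitive JSON module if JSON-in-CSV support is not needed.
        exclude(group = "org.jetbrains.kotlinx", module = "dataframe-json")
    }
}
```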
dataframe-excel
Provides support for reading and writing Excel files (.xls and .xlsx).
Compatible with standard spreadsheet editors and supports embedded structured data.
Note that dataframe-json is included with dataframe-excel by default. This is to support JSON structures inside Excel files. If you don't need this functionality, you can exclude dataframe-json from the dataframe-excel dependency.
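For the Excel module, the exclusion could be sketched the same way. The version value is a placeholder assumed for illustration.

```kotlin
// build.gradle.kts
// Placeholder (assumption): replace with a real Kotlin DataFrame release.
val dataframeVersion = "x.y.z"

dependencies {
    implementation("org.jetbrains.kotlinx:dataframe-excel:$dataframeVersion") {
        // Drop the transitive JSON module if JSON-in-Excel support is not needed.
        exclude(group = "org.jetbrains.kotlinx", module = "dataframe-json")
    }
}
```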
dataframe-jdbc
Provides support for reading data from SQL databases that are accessible through JDBC.
See Read from SQL databases for more information about how to use it.
dataframe-arrow
Provides support for reading and writing Apache Arrow formats.
See Read Apache Arrow formats and Writing to Apache Arrow formats for more information about how to use it.
Experimental Kotlin DataFrame modules
These modules are experimental and may be unstable.
dataframe-geo
Provides a new API for working with geospatial data, including reading and writing geospatial formats (GeoJSON, Shapefile), and performing geometry-aware operations.
See Geo guide for more details and examples.
Requires OSGeo Repository.
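A sketch of how the extra repository and the module might be declared together. The repository URL below is an assumption (the commonly used OSGeo release repository) and the version value is a placeholder; verify both against the Geo guide before relying on them.

```kotlin
// build.gradle.kts
repositories {
    mavenCentral()
    // Assumption: URL of the OSGeo release repository.
    maven("https://repo.osgeo.org/repository/release")
}

// Placeholder (assumption): replace with a real Kotlin DataFrame release.
val dataframeVersion = "x.y.z"

dependencies {
    // Experimental module; not included in the general dataframe artifact.
    implementation("org.jetbrains.kotlinx:dataframe-geo:$dataframeVersion")
}
```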
dataframe-openapi
Provides functionality to support auto-generated data schemas from OpenAPI 3.0.0 specifications.
This module is a companion to dataframe-openapi-generator:
dataframe-openapi-generator is used internally by the Gradle plugin and Jupyter integration to generate data schemas from OpenAPI specs. In the Gradle plugin, it powers the dataschemas {} DSL and the @file:ImportDataSchema() annotation. In Jupyter, it enables the importDataSchema() function.
dataframe-openapi must be added as a dependency to the user project in order to use those generated data schemas.
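Adding the runtime module to a user project could be sketched as follows. The version value is a placeholder assumed for illustration.

```kotlin
// build.gradle.kts
// Placeholder (assumption): replace with a real Kotlin DataFrame release.
val dataframeVersion = "x.y.z"

dependencies {
    // Experimental module needed at runtime by schemas generated from OpenAPI specs.
    implementation("org.jetbrains.kotlinx:dataframe-openapi:$dataframeVersion")
}
```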
dataframe-openapi-generator
Provides the logic and tooling necessary to import OpenAPI 3.0.0 specifications as auto-generated data schemas for Kotlin DataFrame. This module works in conjunction with dataframe-openapi:
dataframe-openapi-generator is used internally by the Gradle plugin and Jupyter integration to generate data schemas from OpenAPI specifications. In Gradle, it enables the dataschemas {} DSL and the @file:ImportDataSchema() annotation. In Jupyter, it powers the importDataSchema() function.
dataframe-openapi must be added as a dependency to the user project to actually use the generated schemas.
Plugins
kotlin.plugin.dataframe – Kotlin DataFrame Compiler Plugin
The Kotlin DataFrame Compiler Plugin automatically generates extension properties and updates data schemas on the fly in Gradle projects, making development with Kotlin DataFrame faster, more convenient, and fully type- and name-safe.
To enable the plugin in your Gradle project, add it to the plugins section of your build script.
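A minimal sketch of the plugins block, assuming a Kotlin/JVM project. Both version strings are placeholders; since the plugin is published as part of Kotlin, its version is expected to match the Kotlin version used by the project.

```kotlin
// build.gradle.kts
plugins {
    // Placeholder versions (assumption): use the Kotlin version of your project for both.
    kotlin("jvm") version "x.y.z"
    // Resolves to the plugin id org.jetbrains.kotlin.plugin.dataframe.
    kotlin("plugin.dataframe") version "x.y.z"
}
```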
Due to a known issue, incremental compilation must be disabled for now. Add the line kotlin.incremental=false to your gradle.properties file.
Published as an official Kotlin plugin. Source code is available in the Kotlin repository.
kotlinx.dataframe – Gradle Plugin
The Gradle plugin generates data schemas from data samples in supported formats (such as JSON, CSV, or Excel files, or URLs), as well as from data fetched from SQL databases, as part of the Gradle build.
See the Gradle Plugin Reference for installation and usage instructions in Gradle projects.
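For orientation, applying the plugin could be sketched as below. The version string is a placeholder assumed for illustration; the Gradle Plugin Reference remains the authoritative source for installation and configuration.

```kotlin
// build.gradle.kts
plugins {
    // Placeholder version (assumption): replace with a real plugin release.
    id("org.jetbrains.kotlinx.dataframe") version "x.y.z"
}
```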
kotlinx.dataframe:symbol-processor-all – KSP Plugin
The KSP plugin generates data schemas from data samples in supported formats (such as JSON, CSV, or Excel files, or URLs), as well as from data fetched from SQL databases, using Kotlin Symbol Processing (KSP). This is useful for projects where you prefer or require schema generation at the source level.
See Data Schemas in Gradle Projects for usage details.
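A hedged sketch of wiring the processor through KSP. The full coordinates (the org.jetbrains prefix on the group) and both version strings are assumptions made for this illustration; check them against the linked page.

```kotlin
// build.gradle.kts
plugins {
    // Placeholder version (assumption): the KSP version must be compatible
    // with the Kotlin version used by the project.
    id("com.google.devtools.ksp") version "x.y.z"
}

// Placeholder (assumption): replace with a real Kotlin DataFrame release.
val dataframeVersion = "x.y.z"

dependencies {
    // Registers the DataFrame symbol processor with KSP (assumed coordinates).
    ksp("org.jetbrains.kotlinx.dataframe:symbol-processor-all:$dataframeVersion")
}
```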