66 changes: 66 additions & 0 deletions README-zh.md
@@ -127,3 +127,69 @@ Comparison of the TsFile, CSV, and Parquet file formats
[C++](./cpp/README-zh.md)

[Python](./python/README-zh.md)

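## Command-Line Tool (tsfile-cli)

Apache TsFile provides `tsfile-cli`, a single-file, pipe-friendly command-line tool for
inspecting **and** importing `.tsfile` files directly from the shell. Read commands
(`ls`, `meta`, `schema`, `stats`, `count`, `head`, `cat`, `sample`) write data to stdout and
diagnostics to stderr, so they compose with tools such as `awk`, `jq`, and `sort`; the `write`
command imports CSV/TSV into a new `.tsfile`. Supported output formats: `csv`, `tsv`, `json` (NDJSON), and `table`.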

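### Commands

| Command | What it does |
|---|---|
| `ls` | List the tables (table model) or devices (tree model) in the file, one name per line |
| `meta` | File overview: data model, table / device / series counts, time range, file size |
| `schema` | Data type, encoding, and compression of each series |
| `stats` | Statistics for each series: row count, time range, min / max, first / last value, sum |
| `count` | Row count of each series plus a total, read directly from statistics without scanning data pages |
| `head` | Print the first N rows (default 10; adjust with `-n`) |
| `cat` | Stream every matching row |
| `sample` | Take a reproducible reservoir sample of rows (`-n`, `--seed`) |
| `write` | Import CSV/TSV into a new table-model `.tsfile` |

The metadata commands (`ls`, `meta`, `schema`, `stats`, `count`) answer most questions without decoding data pages,
while `head`, `cat`, and `sample` read the actual row data.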

### Examples

```bash
tsfile-cli ls data.tsfile # list tables / devices
tsfile-cli meta data.tsfile # file overview (model, counts, time range, size)
tsfile-cli head -n 20 data.tsfile # first 20 rows
tsfile-cli cat -m temp,humidity -f csv data.tsfile # stream the selected columns as CSV

# import CSV/TSV into a new table-model .tsfile
printf 'time,id1,s1\n0,dev,0\n1,dev,10\n' \
| tsfile-cli write --table t1 --columns "id1:STRING:tag,s1:INT64:field" -o out.tsfile -
```

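### Building

> **Platform support.** `tsfile-cli` can currently be built from source on **Linux and macOS** only.
> Pre-built releases of the tool will be published separately at a later date.

`tsfile-cli` is built together with the C++ module, so building that module with Maven from the repository root includes it in the build output: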

```bash
./mvnw clean package -P with-cpp
```

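The resulting executable is at `cpp/target/build/bin/tsfile-cli`, and the shared library it depends on, `libtsfile`, is under
`cpp/target/build/lib/` (`libtsfile.so` on Linux, `libtsfile.dylib` on macOS). `tsfile-cli`
loads `libtsfile` at runtime, so the dynamic linker must be able to find the library. Either keep it under
`cpp/target/build/lib` and add that directory to the library search path, or copy `libtsfile` next to the executable (or into a system library directory):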

```bash
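# Run from the repository root; the paths below are relative to it.
# Linux
export LD_LIBRARY_PATH="$PWD/cpp/target/build/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
# macOS
export DYLD_LIBRARY_PATH="$PWD/cpp/target/build/lib${DYLD_LIBRARY_PATH:+:$DYLD_LIBRARY_PATH}"

cpp/target/build/bin/tsfile-cli --version
cpp/target/build/bin/tsfile-cli --help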
```

See [`cpp/tools/README.md`](./cpp/tools/README.md) for the full command and option reference.
69 changes: 69 additions & 0 deletions README.md
@@ -127,3 +127,72 @@ more see [Docs](https://iotdb.apache.org/UserGuide/latest/Basic-Concept/Encoding
[C++](./cpp/README.md)

[Python](./python/README.md)

## Command-Line Tool (tsfile-cli)

Apache TsFile ships `tsfile-cli`, a single, pipe-friendly command-line tool for inspecting
**and** importing `.tsfile` files directly from the shell. Read commands (`ls`, `meta`,
`schema`, `stats`, `count`, `head`, `cat`, `sample`) write data to stdout and diagnostics to
stderr, so they compose with `awk`, `jq`, `sort`, and similar tools; the `write` command imports
CSV/TSV into a new `.tsfile`. Output formats: `csv`, `tsv`, `json` (NDJSON), and `table`.
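
As a rough illustration of that composability, the sketch below defines a small helper, `csv_column_mean`, that averages one column of CSV read from stdin. The helper name is hypothetical and not part of `tsfile-cli`. It assumes the CSV output starts with a single header row and that the chosen column is numeric, so check the real output of `cat -f csv` before relying on it.

```bash
# Hypothetical helper, not part of tsfile-cli: average one CSV column from stdin.
# Assumes a single header row and numeric values; empty cells are skipped.
csv_column_mean() {
  local col="${1:?usage: csv_column_mean COLUMN_INDEX}"
  awk -F, -v c="$col" '
    NR == 1  { next }                    # skip the header row
    $c != "" { sum += $c; n++ }          # accumulate non-empty cells
    END      { if (n) print sum / n; else exit 1 }
  '
}

# Intended use (needs tsfile-cli on PATH; the column name and index are illustrative):
#   tsfile-cli cat -m temp -f csv data.tsfile | csv_column_mean 2
```

Because diagnostics go to stderr, only the data rows reach the helper through the pipe.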

### Commands

| Command | What it does |
|---|---|
| `ls` | List the tables (table model) or devices (tree model), one name per line |
| `meta` | File summary: data model, table/device/series counts, time range, and file size |
| `schema` | Per-series data type, encoding, and compression |
| `stats` | Per-series statistics: count, time range, min/max, first/last, and sum |
| `count` | Per-series row counts plus a total, read from statistics without scanning data pages |
| `head` | Print the first N rows (default 10; `-n` to change) |
| `cat` | Stream every matching row |
| `sample` | Take a reproducible reservoir sample of rows (`-n`, `--seed`) |
| `write` | Import CSV/TSV into a new table-model `.tsfile` |

The metadata commands (`ls`, `meta`, `schema`, `stats`, `count`) answer most questions
without decoding data pages, while `head`, `cat`, and `sample` read the actual rows.
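
To make "reproducible reservoir sample" concrete, here is a minimal sketch of the classic reservoir technique (Algorithm R) written in `awk`. It illustrates the general idea only and is not `tsfile-cli`'s implementation; the function name `reservoir_sample` and the assumption of one header row followed by data rows are ours.

```bash
# Illustrative reservoir sampler (Algorithm R); NOT tsfile-cli's implementation.
# Reads rows on stdin, passes the header through, and keeps N data rows.
reservoir_sample() {
  local n="${1:?usage: reservoir_sample N SEED}"
  local seed="${2:?usage: reservoir_sample N SEED}"
  awk -v n="$n" -v seed="$seed" '
    BEGIN   { srand(seed) }              # fixed seed makes the sample repeatable
    NR == 1 { print; next }              # pass the header row through
    {
      k = NR - 1                         # 1-based index of this data row
      if (k <= n) { keep[k] = $0; next } # fill the reservoir first
      j = int(rand() * k) + 1            # uniform integer in 1..k
      if (j <= n) keep[j] = $0           # replace a slot with probability n/k
    }
    END { m = (k < n) ? k : n; for (i = 1; i <= m; i++) print keep[i] }
  '
}
```

The sampler makes one pass and holds only N rows in memory, which is why this approach suits streamed input of unknown length. Rows come out in reservoir-slot order rather than input order, and the same seed yields the same sample only with the same `awk` implementation.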

### Examples

```bash
tsfile-cli ls data.tsfile # list tables / devices
tsfile-cli meta data.tsfile # file overview (model, counts, time range, size)
tsfile-cli head -n 20 data.tsfile # first 20 rows
tsfile-cli cat -m temp,humidity -f csv data.tsfile # stream selected columns as CSV

# import CSV/TSV into a new table-model .tsfile
printf 'time,id1,s1\n0,dev,0\n1,dev,10\n' \
| tsfile-cli write --table t1 --columns "id1:STRING:tag,s1:INT64:field" -o out.tsfile -
```

### Building

> **Platform support.** Building `tsfile-cli` from source is currently supported on **Linux
> and macOS** only. Standalone, pre-built releases of the tool are planned for a later date.

`tsfile-cli` is built together with the C++ module, so building that module with Maven from
the repository root includes it in the build output:

```bash
./mvnw clean package -P with-cpp
```

This produces the executable at `cpp/target/build/bin/tsfile-cli` and the shared library it
depends on, `libtsfile`, under `cpp/target/build/lib/` (`libtsfile.so` on Linux,
`libtsfile.dylib` on macOS). `tsfile-cli` loads `libtsfile` at runtime, so the dynamic linker
must be able to find the library. Either keep it under `cpp/target/build/lib` and put that
directory on the library search path, or copy `libtsfile` next to the binary (or into a
system library directory):

```bash
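# Run from the repository root; the paths below are relative to it.
# Linux
export LD_LIBRARY_PATH="$PWD/cpp/target/build/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
# macOS
export DYLD_LIBRARY_PATH="$PWD/cpp/target/build/lib${DYLD_LIBRARY_PATH:+:$DYLD_LIBRARY_PATH}"

cpp/target/build/bin/tsfile-cli --version
cpp/target/build/bin/tsfile-cli --help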
```
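
If you would rather not export the variable for the whole shell session, a small wrapper can set it for a single invocation. The sketch below is one way to do that. The function names `tsfile_lib_env` and `run_tsfile_cli` and the optional `TSFILE_ROOT` override are assumptions made for illustration and are not provided by the project.

```bash
# Hypothetical wrapper; only the build paths come from the README.
# Print "VAR=value" with the right library-path variable for this OS.
tsfile_lib_env() {
  local root="${1:-$PWD}" lib
  lib="$root/cpp/target/build/lib"
  case "$(uname -s)" in
    Darwin) printf 'DYLD_LIBRARY_PATH=%s%s\n' "$lib" "${DYLD_LIBRARY_PATH:+:$DYLD_LIBRARY_PATH}" ;;
    *)      printf 'LD_LIBRARY_PATH=%s%s\n'   "$lib" "${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
  esac
}

# Run tsfile-cli in a subshell so the caller's environment stays untouched.
# TSFILE_ROOT (assumed name) points at the repository root; defaults to $PWD.
run_tsfile_cli() (
  root="${TSFILE_ROOT:-$PWD}"
  export "$(tsfile_lib_env "$root")"
  exec "$root/cpp/target/build/bin/tsfile-cli" "$@"
)

# Intended use, from the repository root after the Maven build:
#   run_tsfile_cli --version
```

The subshell keeps the library path change local to one run, which avoids affecting unrelated programs started from the same shell.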

See [`cpp/tools/README.md`](./cpp/tools/README.md) for the full command and option reference.