[doc] Document remove_orphan_files whole database
JingsongLi committed Jul 5, 2024
1 parent d3612e6 commit 4a89dd9
Showing 3 changed files with 15 additions and 9 deletions.
5 changes: 3 additions & 2 deletions docs/content/flink/procedures.md
@@ -179,14 +179,15 @@ All available procedures are listed below.
</td>
<td>
To remove the orphan data files and metadata files. Arguments:
-<li>identifier: the target table identifier. Cannot be empty.</li>
+<li>identifier: the target table identifier. Cannot be empty, you can use database_name.* to clean whole database.</li>
<li>olderThan: to avoid deleting newly written files, this procedure only
deletes orphan files older than 1 day by default. This argument can modify the interval.
</li>
<li>dryRun: when true, view only orphan files, don't actually remove files. Default is false.</li>
</td>
<td>CALL remove_orphan_files('default.T', '2023-10-31 12:00:00')<br/><br/>
-CALL remove_orphan_files('default.T', '2023-10-31 12:00:00', true)
+CALL remove_orphan_files('default.*', '2023-10-31 12:00:00')<br/><br/>
+CALL remove_orphan_files('default.T', '2023-10-31 12:00:00', true)
</td>
</tr>
<tr>
16 changes: 10 additions & 6 deletions docs/content/maintenance/manage-snapshots.md
@@ -296,7 +296,15 @@ submit a `remove_orphan_files` job to clean them:
{{< tabs "remove_orphan_files" >}}
-{{< tab "Flink" >}}
+{{< tab "Spark SQL/Flink SQL" >}}
+```sql
+CALL sys.remove_orphan_files(table => "my_db.my_table", [older_than => "2023-10-31 12:00:00"])
+
+CALL sys.remove_orphan_files(table => "my_db.*", [older_than => "2023-10-31 12:00:00"])
+```
+{{< /tab >}}
+{{< tab "Flink Action" >}}
```bash
<FLINK_HOME>/bin/flink run \
@@ -322,12 +330,8 @@ To avoid deleting files that are newly added by other writing jobs, this action
--older_than '2023-10-31 12:00:00'
```
{{< /tab >}}
+The table can be `*` to clean all tables in the database.
-{{< tab "Spark" >}}
-```sql
-CALL sys.remove_orphan_files(table => "tableId", [older_than => "2023-10-31 12:00:00"])
-```
-{{< /tab >}}
{{< /tabs >}}
3 changes: 2 additions & 1 deletion docs/content/spark/procedures.md
@@ -129,12 +129,13 @@ This section introduce all available spark procedures about paimon.
<td>remove_orphan_files</td>
<td>
To remove the orphan data files and metadata files. Arguments:
-<li>table: the target table identifier. Cannot be empty.</li>
+<li>table: the target table identifier. Cannot be empty, you can use database_name.* to clean whole database.</li>
<li>older_than: to avoid deleting newly written files, this procedure only deletes orphan files older than 1 day by default. This argument can modify the interval.</li>
<li>dry_run: when true, view only orphan files, don't actually remove files. Default is false.</li>
</td>
<td>
CALL sys.remove_orphan_files(table => 'default.T', older_than => '2023-10-31 12:00:00')<br/><br/>
+CALL sys.remove_orphan_files(table => 'default.*', older_than => '2023-10-31 12:00:00')<br/><br/>
CALL sys.remove_orphan_files(table => 'default.T', older_than => '2023-10-31 12:00:00', dry_run => true)
</td>
</tr>
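
To make the `database_name.*` form documented in this commit concrete, here is a minimal Python sketch of how such an identifier could be expanded into a list of target tables. It is not Paimon's implementation: the function name `resolve_targets` and the `catalog` dictionary are hypothetical, and the sketch assumes the identifier contains exactly one database part followed by either a table name or `*`.

```python
from typing import Dict, List


def resolve_targets(identifier: str, catalog: Dict[str, List[str]]) -> List[str]:
    """Expand 'db.table' or 'db.*' into fully qualified table names.

    Hypothetical helper for illustration only. `catalog` maps a database
    name to the list of tables it contains.
    """
    # The documentation states the identifier cannot be empty.
    if not identifier:
        raise ValueError("identifier cannot be empty")

    # Split at the first dot: left side is the database, right side the table.
    database, sep, table = identifier.partition(".")
    if not sep or not database or not table:
        raise ValueError(
            f"expected 'database.table' or 'database.*', got {identifier!r}"
        )
    if database not in catalog:
        raise KeyError(f"unknown database: {database}")

    # '*' selects every table in the database; otherwise a single table.
    if table == "*":
        return [f"{database}.{name}" for name in catalog[database]]
    return [f"{database}.{table}"]
```

With a catalog such as `{"default": ["T", "U"]}`, the identifier `default.*` expands to both tables, while `default.T` selects only one. Under this reading, a cautious workflow would be to run the whole-database form with the dry-run argument first and review the reported files before removing anything; whether the wildcard and dry-run arguments can be combined is an assumption not confirmed by this commit.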