# seafile_recovery

**Quick description:** `seafile_recovery` is a low-level tool that parses Seafile's on-disk file storage.
Unlike other tools, it works without the associated SQLite or MySQL database.

**Some use cases:** I developed this tool because I lost my database and wanted to get my files back.
It can also help you diagnose problems with Seafile's on-disk file storage and maybe repair them.
Finally, it can help you better understand how Seafile works internally and gather some statistics about the health of your Seafile repositories.

**Features and limitations:** The tool can parse all the commits of a repository. Currently, the `head` subcommand selects the "last" one according to the commit graph and a heuristic based on time. Each commit contains a `RootId`. All files and folders have an ID in Seafile; the `RootId` is simply the ID of the root folder of the repository at the time of the commit. You can inspect these IDs with `ls`. You can copy a file or a folder hierarchy to your disk with `cp`. Finally, the `s3` subcommand transfers the file or folder hierarchy directly to an S3-compatible storage. Currently, the tool does not work with encrypted repositories. Advanced Seafile features are not tested. The tool itself has not been extensively tested and may crash when encountering unusual edge cases.

**Disclaimer:** This tool is community-made and thus not affiliated with Seafile Ltd., Seafile GmbH, or any other company.
I developed this tool for my own needs; I cannot be held responsible for any issue or damage it may cause.
Use it carefully, or not at all if you are not sure of what you are doing: data is often more precious than we imagine.
Always shut down your Seafile daemons (both Seafile and Seahub) before using it.
Create a backup before running any command and double-check all your operations.
## Installation

As a prerequisite, you need a recent version of [Go](https://golang.org/).

```
go get git.deuxfleurs.fr/quentin/seafile_recovery
export PATH="$HOME/go/bin:$PATH"
seafile_recovery --help
```

With Go 1.18 or later, `go get` no longer installs binaries; `go install git.deuxfleurs.fr/quentin/seafile_recovery@latest` should be the equivalent command.
## Tutorial

Let's suppose you know nothing about your storage folder and its repositories.
Start by picking one repository ID in the `storage/commits` folder and run the `head` subcommand. In our example,
we will use `0011d396-4890-463a-8266-bcbd978d8d1c`.

```
$ seafile_recovery head 0011d396-4890-463a-8266-bcbd978d8d1c
2021/04/28 15:10:34 Repo contains 6 commits
2021/04/28 15:10:34 Repo has 1 sources
2021/04/28 15:10:34 Repo has 1 sinks
2021/04/28 15:10:34 Proposing following HEAD:
RootId: 5911dd2d363f591e43df4e80591d0a54975f2aaf
CreatorName: quentin@example.com
Creator: 0000000000000000000000000000000000000000
Description: Added "telecom-reclaimed-web-single-page.pdf".
Ctime: 2021-04-26 12:22:59 +0200 CEST
RepoName: Ma bibliothèque
RepoDesc: Ma bibliothèque
```
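To give an idea of what "proposing a HEAD" can mean, here is a minimal Python sketch of one possible approach, not the tool's actual code: keep the commits that no other commit lists as a parent (what the log above calls "sinks"), then pick the most recent one. The field names `parent_id`, `second_parent_id` and `ctime` are assumptions about the commit JSON, and `propose_head` is a hypothetical helper.

```python
def propose_head(commits):
    """Return the ID of a plausible HEAD commit.

    `commits` maps a commit ID to its decoded JSON object. The keys
    `parent_id`, `second_parent_id` and `ctime` are assumed names.
    """
    # Collect every commit that is referenced as a parent by another one.
    referenced = set()
    for commit in commits.values():
        for key in ("parent_id", "second_parent_id"):
            parent = commit.get(key)
            if parent:
                referenced.add(parent)

    # "Sinks": commits without any child in the graph.
    sinks = [cid for cid in commits if cid not in referenced]

    # Time heuristic: prefer the most recent candidate. If the graph has
    # no sink at all (e.g. it is damaged), fall back to every commit.
    candidates = sinks or list(commits)
    return max(candidates, key=lambda cid: commits[cid]["ctime"])
```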
The `head` output tells us several things about the repository: its name ("Ma bibliothèque"), who made the last change ("quentin@example.com"), and its RootId ("5911dd2d363f591e43df4e80591d0a54975f2aaf").

We can now explore its latest file hierarchy thanks to the RootId (to keep the command readable, we can pass only the beginning of the ID):

```
$ seafile_recovery ls 0011d396-4890-463a-8266-bcbd978d8d1c --dir=5911dd2
2021/04/28 15:15:40 5911dd /
2021/04/28 15:15:40 b88ab9 /seafile-tutorial.doc
2021/04/28 15:15:40 d24616 /Capture d’écran de 2021-04-11 23-07-31.png
2021/04/28 15:15:40 f123de /My Folder/
2021/04/28 15:15:40 15be4d /My Folder/telecom-reclaimed-web-single-page.pdf
2021/04/28 15:15:40 380a0e /My Folder/Capture d’écran vidéo de 19-12-2020 10:30:15.webm
2021/04/28 15:15:40 Total size: 25.6M
```
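Abbreviated IDs such as `5911dd2` have to be expanded to a full object ID at some point. The following Python sketch shows one way this could be done against the on-disk layout described in "Seafile on-disk storage"; it is an illustration, not the tool's implementation, and `resolve_short_id` is a hypothetical helper.

```python
from pathlib import Path


def resolve_short_id(storage, kind, repo_id, short_id):
    """Expand an abbreviated object ID to the single full ID it matches.

    `kind` is one of "commits", "fs" or "blocks". Raises LookupError if
    the abbreviation matches no object or more than one.
    """
    if len(short_id) < 2:
        raise ValueError("need at least the 2 characters naming the sub-directory")

    # Objects live in <storage>/<kind>/<repo_id>/<id[:2]>/<id[2:]>.
    bucket = Path(storage) / kind / repo_id / short_id[:2]
    names = [entry.name for entry in bucket.iterdir()] if bucket.is_dir() else []
    matches = sorted(
        short_id[:2] + name for name in names if name.startswith(short_id[2:])
    )
    if len(matches) != 1:
        raise LookupError(f"{short_id!r} matches {len(matches)} objects")
    return matches[0]
```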
Now, let's suppose we want to extract the folder "My Folder" and its content into a folder named `out`:

```
$ seafile_recovery cp 0011d396-4890-463a-8266-bcbd978d8d1c --dir=f123de ./out
2021/04/28 15:17:28 f123de /
2021/04/28 15:17:28 15be4d /telecom-reclaimed-web-single-page.pdf
2021/04/28 15:17:28 380a0e /Capture d’écran vidéo de 19-12-2020 10:30:15.webm
$ ls out/
'Capture d’écran vidéo de 19-12-2020 10:30:15.webm'  telecom-reclaimed-web-single-page.pdf
```
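Conceptually, copying a file means reading its file object and concatenating the blocks it references, in order (see "Seafile on-disk storage" below). Here is a minimal Python sketch of that idea. The field names `block_ids` and `size` are assumptions about the file object's JSON, and `restore_file`, `load_fs` and `load_block` are hypothetical names, not part of the tool.

```python
import json
import zlib


def restore_file(load_fs, load_block, file_id, out):
    """Write the content of one file object to the binary stream `out`.

    `load_fs(obj_id)` and `load_block(block_id)` return the raw bytes of
    an object from storage/fs and storage/blocks respectively.
    """
    # File objects are zlib-compressed JSON.
    meta = json.loads(zlib.decompress(load_fs(file_id)))

    written = 0
    for block_id in meta["block_ids"]:  # assumed field name
        chunk = load_block(block_id)    # blocks are chunks of raw data
        out.write(chunk)
        written += len(chunk)

    # Optional sanity check against the size recorded in the object.
    if "size" in meta and written != meta["size"]:
        raise ValueError(f"expected {meta['size']} bytes, wrote {written}")
    return written
```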
Finally, if you prefer to upload this content directly to an S3 bucket, you can run:

```
$ seafile_recovery s3 0011d396-4890-463a-8266-bcbd978d8d1c --dir=f123de s3://ACCESS_KEY:SECRET_KEY@ENDPOINT/REGION/BUCKET[/PREFIX]
2021/04/28 15:17:28 f123de /
2021/04/28 15:17:28 15be4d /telecom-reclaimed-web-single-page.pdf
2021/04/28 15:17:28 380a0e /Capture d’écran vidéo de 19-12-2020 10:30:15.webm
```
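The destination packs five or six pieces of information into one URL. The Python sketch below shows how such a string can be split into its parts. It only illustrates the format shown above and is not how the tool parses it; `parse_s3_dest` is a hypothetical helper, and percent-decoding the credentials is an assumption.

```python
from urllib.parse import unquote, urlsplit


def parse_s3_dest(dest):
    """Split s3://ACCESS_KEY:SECRET_KEY@ENDPOINT/REGION/BUCKET[/PREFIX]."""
    parts = urlsplit(dest)
    if parts.scheme != "s3" or parts.username is None or parts.password is None:
        raise ValueError("expected s3://ACCESS_KEY:SECRET_KEY@ENDPOINT/...")

    # The path holds REGION, BUCKET and an optional PREFIX.
    segments = parts.path.lstrip("/").split("/", 2)
    if len(segments) < 2 or not all(segments[:2]):
        raise ValueError("expected /REGION/BUCKET[/PREFIX] after the endpoint")

    endpoint = parts.hostname or ""
    if parts.port:
        endpoint += f":{parts.port}"

    return {
        # Assumption: keys with special characters are percent-encoded.
        "access_key": unquote(parts.username),
        "secret_key": unquote(parts.password),
        "endpoint": endpoint,
        "region": segments[0],
        "bucket": segments[1],
        "prefix": segments[2] if len(segments) == 3 else "",
    }
```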
**Be careful!** This tool is not intended to migrate your Seafile backend from the local filesystem to the S3 backend. Migrating to the S3 backend requires keeping Seafile's objects, which is a totally different job. Appropriate scripts are available in Seafile's official distribution.
## Usage

```
Seafile Recovery.

Usage:
  seafile_recovery [--storage=<sto>] head <repoid>
  seafile_recovery [--storage=<sto>] ls <repoid> (--dir=<dirid> | --file=<fileid>)
  seafile_recovery [--storage=<sto>] cp <repoid> (--dir=<dirid> | --file=<fileid>) <dest>
  seafile_recovery [--storage=<sto>] s3 <repoid> (--dir=<dirid> | --file=<pathid>) <dest>
  seafile_recovery s3del <dest>
  seafile_recovery (-h | --help)

Options:
  -h --help        Show this screen
  --storage=<sto>  Set Seafile storage path [default: ./storage]
  --dir=<dirid>    Seafile Directory ID, can be obtained from commits as RootID
  --file=<fileid>  Seafile File ID, can be obtained through ls
```
## Seafile on-disk storage

Seafile treats your filesystem as a store of objects, each identified by an ID.
All files in Seafile's storage therefore follow this pattern:

```
.../storage/{commits,fs,blocks}/$repo_id/$obj_id[:2]/$obj_id[2:]
```
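As a small illustration, the pattern above translates to the following Python helper. `object_path` is a hypothetical name used only for this example.

```python
from pathlib import Path

KINDS = ("commits", "fs", "blocks")


def object_path(storage, kind, repo_id, obj_id):
    """Return the on-disk path of an object, following the pattern above."""
    if kind not in KINDS:
        raise ValueError(f"kind must be one of {KINDS}")
    # The first two characters of the ID name a sub-directory,
    # the remaining characters name the file inside it.
    return Path(storage) / kind / repo_id / obj_id[:2] / obj_id[2:]
```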
The following diagram shows how these objects are linked to each other and how to read them:

```
 storage/commits/(repoid)            storage/fs/(repoid)    storage/blocks/(repoid)
    (plain text json)                   (json + zlib)         (chunk of raw data)

                                             Dir                (1) ┌──────────┐
    HEAD ┌──────────┐      root_id       ┌──────────┐        ┌─────►│5b/4c09c..│
   (sink)│4f/2fcf9..├───────────────────►│98/ff6e3..│        │      └──────────┘
         └─┬─────┬──┘                    └─┬──────┬─┘        │
    parent │     │ 2nd parent              │      │DirEnt    │  (2) ┌──────────┐
           ▼     ▼                         │      │          ├─────►│eb/a557a..│
 ┌──────────┐  ┌──────────┐            Dir ▼      ▼ File     │      └──────────┘
 │21/22f45..├──┤a5/c7325..├───?   ┌──────────┐ ┌──────────┐  │
 └────────┬─┘  └─┬────────┘       │9f/31be6..│ │3b/2e671..├──┤  (3) ┌──────────┐
   parent │      │ parent         └─────┬──┬─┘ └──────────┘  └─────►│42/1aac0..│
          ▼      ▼                      │  │DirEnt                  └──────────┘
        ┌──────────┐                    │  └──────┐
        │5b/2f24f..├──────────?    File ▼         ▼ File        (1) ┌──────────┐
        └────┬─────┘              ┌──────────┐ ┌──────────┐ ┌──────►│0b/5c780..│
             │ parent             │4a/54b55..│ │ba/557ae..├─┘       └──────────┘
             ▼                    └───────┬──┘ └──────────┘
        ┌──────────┐                      │                     (1) ┌──────────┐
 Initial│69/ca6b5..├──────────?           └────────────────────────►│67/515ea..│
        └────┬─────┘    (? = not shown)                             └──────────┘
             │
             X no parent
```
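To make the diagram more concrete, here is a minimal Python sketch that decodes the two JSON object kinds and walks a directory tree the way `ls` lists it. Take it as an illustration under assumptions, not as the tool's code: the field names `dirents`, `id`, `name` and `mode` are assumed from Seafile's fs objects, directories are assumed to be recognizable from the Unix mode bits, and `decode_commit`, `decode_fs_object` and `walk` are hypothetical names.

```python
import json
import stat
import zlib


def decode_commit(raw):
    """Commits are stored as plain-text JSON."""
    return json.loads(raw)


def decode_fs_object(raw):
    """Dir and File objects are stored as zlib-compressed JSON."""
    return json.loads(zlib.decompress(raw))


def walk(load_fs, dir_id, path="/"):
    """Yield (object ID, path) pairs for a directory and everything below it.

    `load_fs(obj_id)` returns the raw bytes of an object in storage/fs.
    """
    yield dir_id, path
    for entry in decode_fs_object(load_fs(dir_id)).get("dirents", []):
        child = path + entry["name"]
        if stat.S_ISDIR(entry["mode"]):  # assumed: mode holds Unix mode bits
            yield from walk(load_fs, entry["id"], child + "/")
        else:
            yield entry["id"], child
```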
----

## Dev notes

TODO: check how Seafile handles ID collisions. There might be one here, in a repository with `44592` commits:

```
$ ls -lah
62684fe2260d67b6b5d2de909c3816feb21c39  bd8d7b2df788bf8bb6efc87ddb52c6f595ea7e         ffc4e7f4273c8e4cc57124ccb6d65467c3b6a3
641064a61de537a696f2172e90be9c8ac4ae04  bd8d7b2df788bf8bb6efc87ddb52c6f595ea7e.8WWPVZ
$ ls -lah bd8d7b2df788bf8bb6efc87ddb52c6f595ea7e.8WWPVZ
-rw------- 1 1000 1000 0 Jan 12 2019 bd8d7b2df788bf8bb6efc87ddb52c6f595ea7e.8WWPVZ
$ ls -lah bd8d7b2df788bf8bb6efc87ddb52c6f595ea7e
-rw------- 1 1000 1000 629 Jan 12 2019 bd8d7b2df788bf8bb6efc87ddb52c6f595ea7e
```