Seafile On-Disk File Storage Recovery Tool
Quentin 81bf5405e7
Improve README
2021-04-28 15:06:20 +02:00

seafile_recovery

Quick description: seafile_recovery is a low-level tool that parses Seafile's on-disk file storage. Unlike other tools, it works without the associated SQLite or MySQL database.

Some use cases: I developed this tool because I lost my database and wanted to get my files back. It can also help you diagnose problems with Seafile's on-disk file storage and maybe repair them. Finally, it can help you better understand how Seafile works internally and gather some statistics about the health of your Seafile repositories.

Features and limitations: The tool can parse all commits of a repository. Currently, the head subcommand selects the "last" one according to the commit graph and a heuristic based on time. Each commit contains a RootId. All files and folders have an Id in Seafile; the RootId is simply the Id of the root folder of the repository at the time of the commit. You can inspect these Ids with ls. You can copy a file or a folder hierarchy to your disk with cp. Finally, the s3 subcommand directly transfers the file or folder hierarchy to an S3-compatible storage. Currently, the tool does not work with encrypted repositories. Advanced Seafile features are not tested. The tool has not been extensively tested and may crash when encountering unusual edge cases.
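To make the head selection more concrete, here is a minimal sketch of one possible approach: keep the commits that are nobody's parent (the sinks of the commit graph), then prefer the most recent one. This is not necessarily what the head subcommand does. The Commit type, its field names, the pickHead function and the commit IDs are all hypothetical and only serve the illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Commit is a hypothetical, reduced view of a commit object.
// The field names are assumptions made for this sketch.
type Commit struct {
	ID             string
	ParentID       string // empty for the initial commit
	SecondParentID string // empty unless the commit is a merge
	CTime          int64  // creation time, Unix seconds
}

// pickHead returns the most recent commit that no other commit
// references as a parent. It reports false if no candidate exists.
func pickHead(commits []Commit) (Commit, bool) {
	// Record every ID that appears as a parent of some commit.
	isParent := make(map[string]bool)
	for _, c := range commits {
		if c.ParentID != "" {
			isParent[c.ParentID] = true
		}
		if c.SecondParentID != "" {
			isParent[c.SecondParentID] = true
		}
	}

	// Sinks are the commits that are never referenced as a parent.
	var sinks []Commit
	for _, c := range commits {
		if !isParent[c.ID] {
			sinks = append(sinks, c)
		}
	}
	if len(sinks) == 0 {
		return Commit{}, false
	}

	// Time-based heuristic: the newest sink wins.
	sort.Slice(sinks, func(i, j int) bool { return sinks[i].CTime > sinks[j].CTime })
	return sinks[0], true
}

func main() {
	// Made-up commit graph: c4 merges c2 and c3, c5 is a dangling branch.
	commits := []Commit{
		{ID: "c0", CTime: 100},
		{ID: "c1", ParentID: "c0", CTime: 200},
		{ID: "c2", ParentID: "c1", CTime: 300},
		{ID: "c3", ParentID: "c1", CTime: 310},
		{ID: "c4", ParentID: "c2", SecondParentID: "c3", CTime: 400},
		{ID: "c5", ParentID: "c0", CTime: 150},
	}
	head, ok := pickHead(commits)
	if !ok {
		fmt.Println("no head candidate found")
		return
	}
	fmt.Println("head:", head.ID)
}
```

In this example both c4 and c5 are sinks, which is why a graph-only rule is not enough and a tie-break on time is needed.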

Disclaimer: This tool is community-made and thus not affiliated with Seafile Ltd., Seafile GmbH, or any other company. I developed this tool for my own needs and cannot be held responsible for any issue or damage it may cause. Use it carefully, or not at all if you are not sure of what you are doing; data is often more precious than we imagine. Always shut down your Seafile daemons before using it (both Seafile and Seahub). Create a backup before running any command and double-check all your operations.

Installation

go get git.deuxfleurs.fr/quentin/seafile_recovery
~/go/bin/seafile_recovery --help

Usage

Seafile Recovery.

Usage:
  seafile_recovery [--storage=<sto>] head <repoid>
  seafile_recovery [--storage=<sto>] ls <repoid> (--dir=<dirid> | --file=<fileid>)
  seafile_recovery [--storage=<sto>] cp <repoid> (--dir=<dirid> | --file=<fileid>) <dest>
  seafile_recovery [--storage=<sto>] s3 <repoid> (--dir=<dirid> | --file=<fileid>) <dest>
  seafile_recovery s3del <dest>
  seafile_recovery (-h | --help)

Options:
  -h --help        Show this screen
  --storage=<sto>  Set Seafile storage path [default: ./storage]
  --dir=<dirid>    Seafile Directory ID, can be obtained from commits as RootID
  --file=<fileid>  Seafile File ID, can be obtained through ls

Seafile on-disk storage

 storage/commits/(repoid)           storage/fs/(repoid)          storage/blocks/(repoid)
    (plain text json)                  (json + zlib)               (chunk of raw data)

                                        Dir                    (1) ┌──────────┐
  HEAD ┌──────────┐    root_id         ┌──────────┐         ┌─────►│5b/4c09c..│
 (sink)│4f/2fcf9..├───────────────────►│98/ff6e3..│         │      └──────────┘
       └─┬─────┬──┘                    └─┬──────┬─┘         │
  parent │     │ 2nd parent              │      │DirEnt     │  (2) ┌──────────┐
         ▼     ▼                         │      │           ├─────►│eb/a557a..│
┌──────────┐  ┌──────────┐       Dir     ▼      ▼    File   │      └──────────┘
│21/22f45..├──┤a5/c7325..├───?  ┌──────────┐  ┌──────────┐  │
└────────┬─┘  └─┬────────┘      │9f/31be6..│  │3b/2e671..├──┤  (3) ┌──────────┐
  parent │      │ parent        └─────┬──┬─┘  └──────────┘  └─────►│42/1aac0..│
         ▼      ▼                     │  │DirEnt                   └──────────┘
       ┌──────────┐                   │  └──────┐
       │5b/2f24f..├──────────?   File ▼         ▼     File     (1) ┌──────────┐
       └────┬─────┘             ┌──────────┐  ┌──────────┐ ┌──────►│0b/5c780..│
            │  parent           │4a/54b55..│  │ba/557ae..├─┘       └──────────┘
            ▼                   └───────┬──┘  └──────────┘
       ┌──────────┐                     │                      (1) ┌──────────┐
Initial│69/ca6b5..├──────────?          └─────────────────────────►│67/515ea..│
       └────┬─────┘ (? = not shown)                                └──────────┘
            │
            X no parent
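
The layout above can be sketched in Go. This is a minimal illustration, not the tool's actual code: it assumes object IDs are 40 hexadecimal characters, split into a 2-character directory and a 38-character file name, and that objects under fs are JSON documents compressed with zlib. The function names (objectPath, decodeFSObject, encodeFSObject), the placeholder ID and the sample payload are hypothetical; encodeFSObject only exists so the example can run without a real storage directory.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"encoding/json"
	"fmt"
	"io"
	"path/filepath"
)

// objectPath builds <storage>/<kind>/<repoID>/<2 chars>/<38 chars>,
// where kind is "commits", "fs" or "blocks".
// Assumption: IDs are 40 hexadecimal characters.
func objectPath(storage, kind, repoID, id string) (string, error) {
	if len(id) != 40 {
		return "", fmt.Errorf("expected a 40-character ID, got %d characters", len(id))
	}
	return filepath.Join(storage, kind, repoID, id[:2], id[2:]), nil
}

// decodeFSObject inflates a zlib-compressed JSON document and parses
// it into a generic map, without assuming any particular field.
func decodeFSObject(raw []byte) (map[string]interface{}, error) {
	zr, err := zlib.NewReader(bytes.NewReader(raw))
	if err != nil {
		return nil, err
	}
	defer zr.Close()

	data, err := io.ReadAll(zr)
	if err != nil {
		return nil, err
	}
	var obj map[string]interface{}
	if err := json.Unmarshal(data, &obj); err != nil {
		return nil, err
	}
	return obj, nil
}

// encodeFSObject is the inverse operation, used here only to build
// sample data for the demonstration.
func encodeFSObject(obj map[string]interface{}) ([]byte, error) {
	data, err := json.Marshal(obj)
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	zw := zlib.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Placeholder ID and repository name, not taken from a real repository.
	p, err := objectPath("./storage", "fs", "my-repo-id", "0123456789abcdef0123456789abcdef01234567")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("object path:", p)

	// Round trip on a made-up payload instead of a real fs object.
	raw, err := encodeFSObject(map[string]interface{}{"example": "value"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	obj, err := decodeFSObject(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("decoded object:", obj)
}
```

On a real storage directory, the bytes passed to decodeFSObject would come from os.ReadFile on the path returned by objectPath. Commit objects are plain-text JSON according to the diagram, so they would skip the zlib step.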

Tutorial


Dev notes

Should look into how Seafile handles ID collisions; there might be one here, in a repo with 44592 commits:

$ ls -lah
62684fe2260d67b6b5d2de909c3816feb21c39	bd8d7b2df788bf8bb6efc87ddb52c6f595ea7e	       ffc4e7f4273c8e4cc57124ccb6d65467c3b6a3
641064a61de537a696f2172e90be9c8ac4ae04	bd8d7b2df788bf8bb6efc87ddb52c6f595ea7e.8WWPVZ
$ ls -lah bd8d7b2df788bf8bb6efc87ddb52c6f595ea7e.8WWPVZ
-rw------- 1 1000 1000 0 Jan 12  2019 bd8d7b2df788bf8bb6efc87ddb52c6f595ea7e.8WWPVZ
$ ls -lah bd8d7b2df788bf8bb6efc87ddb52c6f595ea7e
-rw------- 1 1000 1000 629 Jan 12  2019 bd8d7b2df788bf8bb6efc87ddb52c6f595ea7e
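
To find other entries like this one across a large repository, a small classifier can help. The sketch below assumes regular object file names are exactly 38 lowercase hexadecimal characters (as in the listing above) and flags names that carry an extra dot-separated suffix. Whether such a suffixed file is a real collision or something else, such as a leftover from an interrupted write, is still an open question. The classify function and its labels are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// A regular object file name: 38 lowercase hexadecimal characters.
	objectName = regexp.MustCompile(`^[0-9a-f]{38}$`)
	// The same shape followed by a single dot-separated suffix.
	suffixedName = regexp.MustCompile(`^[0-9a-f]{38}\.[^.]+$`)
)

// classify labels a directory entry name as "object", "suffixed"
// or "unknown", based on its shape only.
func classify(name string) string {
	switch {
	case objectName.MatchString(name):
		return "object"
	case suffixedName.MatchString(name):
		return "suffixed"
	default:
		return "unknown"
	}
}

func main() {
	// Names copied from the listing above.
	names := []string{
		"62684fe2260d67b6b5d2de909c3816feb21c39",
		"bd8d7b2df788bf8bb6efc87ddb52c6f595ea7e",
		"bd8d7b2df788bf8bb6efc87ddb52c6f595ea7e.8WWPVZ",
	}
	for _, n := range names {
		fmt.Printf("%-10s %s\n", classify(n), n)
	}
}
```

Feeding it the names returned by os.ReadDir on each 2-character directory would give a quick count of suffixed entries per repository.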