Scrub persistence decode error #520

Closed
opened 2023-03-06 15:17:24 +00:00 by jpds · 4 comments
Contributor

I appear to have missed a migration step in #516 - after deploying to my nodes, they hit:

ERROR garage_util::persister: Unable to decode persisted data file /meta/scrub_info

`last-complete` is epoch, but `next-run` is fine. I can't see where in the code I could fix this. Or is it just a case of waiting for the next run, after which everything will be fine?

I deleted the file and restarted the server, but it doesn't get recreated, which is the bigger problem if `next-run` keeps being pushed into the future.

Owner

I think we don't want to lose info, especially the corruption counter, so we need to write a migration. Here is an example of a data structure that supports migration between versions:

https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/main/src/model/key_table.rs

Basically, this is what happens:

  • the old version is kept in a `mod` named after the current version, or after the version in which it was introduced (it is a `mod` inside the same file, not a separate file; see the example)
  • the declaration of the new version is put in a `mod` named after the upcoming version
  • the old declaration implements `garage_util::migrate::InitialFormat`
  • the new declaration implements `garage_util::migrate::Migrate`, where it declares `Previous` (the struct of the previous version) and `migrate` (the function that transforms the old struct into the new one)
  • a `pub use vxxx::*` makes the new declaration directly usable throughout the code (`xxx` is updated to match the latest version as more migrations are added)
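
As a rough illustration of the layout described above, here is a minimal sketch. It makes several assumptions: the two traits are simplified local stand-ins for `garage_util::migrate::InitialFormat` and `garage_util::migrate::Migrate` (the real ones also deal with serialization), the module names `v081` and `v082` are made up, and the struct fields are hypothetical rather than the actual contents of `ScrubWorkerPersisted`.

```rust
// Simplified stand-ins for the traits in garage_util::migrate.
// Assumption: the real traits also handle (de)serialization, omitted here.
pub trait InitialFormat {}

pub trait Migrate {
    /// The struct of the previous version.
    type Previous;
    /// Transforms the old struct into the new struct.
    fn migrate(old: Self::Previous) -> Self;
}

// Old declaration, kept in a mod named after the version that introduced it
// (module name is hypothetical).
mod v081 {
    use super::InitialFormat;

    #[derive(Debug, Clone, PartialEq)]
    pub struct ScrubWorkerPersisted {
        // Hypothetical fields, for illustration only.
        pub time_last_complete_scrub: u64,
        pub corruptions_detected: u64,
    }

    impl InitialFormat for ScrubWorkerPersisted {}
}

// New declaration, in a mod named after the upcoming version
// (module name is hypothetical).
mod v082 {
    use super::{v081, Migrate};

    #[derive(Debug, Clone, PartialEq)]
    pub struct ScrubWorkerPersisted {
        pub time_last_complete_scrub: u64,
        pub time_next_run_scrub: u64,
        pub corruptions_detected: u64,
    }

    impl Migrate for ScrubWorkerPersisted {
        type Previous = v081::ScrubWorkerPersisted;

        fn migrate(old: Self::Previous) -> Self {
            ScrubWorkerPersisted {
                // Existing info, including the corruption counter, is carried over.
                time_last_complete_scrub: old.time_last_complete_scrub,
                corruptions_detected: old.corruptions_detected,
                // Placeholder for the new field; a real migration would
                // pick a sensible next-run time here.
                time_next_run_scrub: 0,
            }
        }
    }
}

// Makes the latest declaration directly usable in the rest of the code.
pub use v082::*;
```

With this layout, the rest of the code only ever names `ScrubWorkerPersisted`, and only the `pub use` line changes when another version is added.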

Would you like to try doing this for the `ScrubWorkerPersisted` data structure, or should I do it myself?

> I deleted the file, and restarted the server - it doesn't get recreated, which is the bigger problem if the next-run is constantly pushed into the future.

For this one, I think you can just call `persister.save()` once when the scrub worker is created.

Author
Contributor

> For this one, I think you can just call persister.save() once when the scrub worker is created

I cannot find an easy way to call this, as it is a method of `Persister` and every part of the code seems to use `PersisterShared` instead. What should I do?

Owner

As a hack, I think you can call `persister.set_with(|_| ());`. That should do for now.
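
To show why a no-op closure is enough to get the file written, here is a minimal sketch. It assumes that `set_with` applies the closure to the in-memory value and then saves it unconditionally. `SharedPersister` is a hypothetical, simplified stand-in written for this example, not the real `PersisterShared`, and it stores a plain integer as text instead of Garage's actual format.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::sync::RwLock;

/// Hypothetical stand-in for a shared persister: keeps a value in memory
/// and rewrites its backing file on every `set_with` call.
pub struct SharedPersister {
    path: PathBuf,
    value: RwLock<u64>,
}

impl SharedPersister {
    /// Loads the value from `path` if it exists and decodes; otherwise
    /// falls back to `default` WITHOUT writing anything to disk.
    pub fn new(path: PathBuf, default: u64) -> Self {
        let value = fs::read_to_string(&path)
            .ok()
            .and_then(|s| s.trim().parse().ok())
            .unwrap_or(default);
        SharedPersister {
            path,
            value: RwLock::new(value),
        }
    }

    /// Applies `f` to the in-memory value, then saves it to disk.
    /// The save happens even if `f` changes nothing.
    pub fn set_with<F: FnOnce(&mut u64)>(&self, f: F) -> io::Result<()> {
        let mut guard = self.value.write().unwrap();
        f(&mut *guard);
        fs::write(&self.path, guard.to_string())
    }
}
```

Under that assumption, calling `set_with(|_| ())` once right after construction recreates a missing file with the default value.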

Owner
I'd add it here: https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/main/src/block/manager.rs#L140
lx closed this issue 2023-03-10 13:25:02 +00:00
Reference: Deuxfleurs/garage#520