Support portability of db #31

Open
petacz opened this issue Jan 22, 2020 · 6 comments

Comments

@petacz

petacz commented Jan 22, 2020

Hi,
It would be great if the tool supported portability of the db.

In my case I have data that I back up to at least 2 drives, and I want to verify that all drives contain the same data.
I have them mounted at /mnt/y and /mnt/v.
I want to check both drives using the same database, but create hashes from only one of them.

So maybe just another argument that would replace the /mnt/y prefix with /mnt/v in memory would be sufficient.

I will try to look into the code and create a pull request if I find time.

Thanks,
Petr

@trapexit
Owner

Making scorch store relative paths is possible, though you'd then either always have to run the program from the location the paths are relative to or provide that location explicitly, which isn't pleasant. As you mention, the alternative would be to allow for path substitution.

Offhand, I'm not sure what the pros and cons of the two are. Substitution is weird in that part of the point of the tool is to keep an index of the system, so absolute paths are necessary. Then, if you indexed both /mnt/y and /mnt/v, how does it behave with this substitution? Does it point to /mnt/v but replace '/mnt/v' with '/mnt/y' for lookup? So if you pointed it to /mnt, would it do two validations: one for the real /mnt/y and one for the substituted /mnt/v? Probably not a problem, but something to consider.
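
For illustration only, the lookup-time substitution discussed here could look roughly like the sketch below. It is not scorch code: `substitute_prefix` is a hypothetical helper, and it assumes POSIX-style paths and that only whole path components should match (so /mnt/vault is not treated as being under /mnt/v).

```python
import posixpath


def substitute_prefix(path, old_prefix, new_prefix):
    """Map a path under old_prefix to the same location under new_prefix.

    Hypothetical helper, not part of scorch. Paths outside old_prefix are
    returned unchanged (after normalization).
    """
    old = posixpath.normpath(old_prefix)
    new = posixpath.normpath(new_prefix)
    path = posixpath.normpath(path)
    if path == old:
        return new
    # Require a separator after the prefix so "/mnt/vault" does not match "/mnt/v".
    if path.startswith(old + "/"):
        return new + path[len(old):]
    return path


# While walking the mirror at /mnt/v, look up the entry recorded under /mnt/y.
print(substitute_prefix("/mnt/v/photos/a.jpg", "/mnt/v", "/mnt/y"))
```

With a mapping like this, a file found on the mirror is looked up under the path that was originally indexed, and anything outside the substituted prefix is treated as usual.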

@aw

aw commented May 13, 2020

Sorry to hijack this issue, but I ran into a similar problem and tried to fix it by creating a symlink from /mnt/y to /mnt/v, so that the original indexed path exists (although it's a symlink), but it doesn't seem to work.

The strace output does show an AT_SYMLINK_NOFOLLOW flag, but I'm not sure how to change that in the Python code, and the exit code is 16 (despite the file technically being available at said path).
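
For context, AT_SYMLINK_NOFOLLOW is what a stat call uses when it inspects the link itself rather than its target. In Python that corresponds to `os.lstat()` or `os.stat(..., follow_symlinks=False)`. Which call scorch actually makes is an assumption here; the sketch below only shows the difference, using a throwaway temp directory and a hypothetical `kind` helper.

```python
import os
import stat
import tempfile


def kind(path, follow_symlinks):
    """Classify a path, optionally without following a final symlink."""
    st = os.stat(path, follow_symlinks=follow_symlinks)
    if stat.S_ISLNK(st.st_mode):
        return "symlink"
    if stat.S_ISDIR(st.st_mode):
        return "directory"
    return "other"


with tempfile.TemporaryDirectory() as tmp:
    real = os.path.join(tmp, "v")   # stands in for /mnt/v
    link = os.path.join(tmp, "y")   # stands in for the symlink at /mnt/y
    os.mkdir(real)
    os.symlink(real, link)
    print(kind(link, follow_symlinks=True))    # sees the target directory
    print(kind(link, follow_symlinks=False))   # sees the link itself
```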

@trapexit
Owner

Symlinks are files, so scorch doesn't follow them; that way it can index the links themselves. You could use a bind mount.

If paths are relative then there can be dups. The primary key of the existing system would have to change, and to what?

Is this a permanent or temporary thing? I could make a simple function that replaces strings in the filepath, but how it does that will depend on the actual workflow. Or perhaps I could do something where it compares files about to be added to the DB against missing files and, if the details match up, changes the filepath?
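
A rough sketch of that second idea, under stated assumptions: both the DB and the scan are modeled as plain dicts mapping a path to a `(size, mtime, digest)` tuple, which is not scorch's real schema, and `find_renames` is a hypothetical name. A path is only rewritten when exactly one missing entry and exactly one new file share the same details.

```python
def group_by_details(entries):
    """Group paths by their details tuple, e.g. (size, mtime, digest)."""
    groups = {}
    for path, details in entries.items():
        groups.setdefault(details, []).append(path)
    return groups


def find_renames(db, on_disk):
    """Return {old_path: new_path} for unambiguous moves.

    db and on_disk are assumed to be dicts of path -> details tuple.
    """
    missing = {p: d for p, d in db.items() if p not in on_disk}
    added = {p: d for p, d in on_disk.items() if p not in db}
    gone = group_by_details(missing)
    new = group_by_details(added)
    renames = {}
    for details, old_paths in gone.items():
        new_paths = new.get(details, [])
        # Skip ambiguous cases, such as several identical files.
        if len(old_paths) == 1 and len(new_paths) == 1:
            renames[old_paths[0]] = new_paths[0]
    return renames


db = {"/mnt/y/a": (10, 1.0, "aa"), "/mnt/y/b": (20, 2.0, "bb")}
on_disk = {"/mnt/v/a": (10, 1.0, "aa"), "/mnt/y/b": (20, 2.0, "bb")}
print(find_renames(db, on_disk))
```

Skipping ambiguous matches keeps identical files from being renamed onto each other by accident; those would still show up as missing and new.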

@aw

aw commented May 13, 2020

Oh of course! I completely forgot about bind mounts. That works perfectly, thanks!

I prefer this solution, since it doesn't require you to make changes. Also, @petacz, note that this would solve your issue as well:

mount -o bind /mnt/y /mnt/v

@trapexit
Owner

It's not about whether or not a change is needed. That's fine. I just don't know what the intended workflow is. Is it "I happen to have a mirror of the data elsewhere and want to check it" or "I have moved my paths around and want to rename the filepaths"? Or both?

@aw

aw commented May 13, 2020

The use-case is having a mirror of the data elsewhere.


3 participants