Discussion:
Pictures as symbolic links
Ethan
2011-09-01 08:30:54 UTC
Hi,

I was going to file this as a ticket on redmine, but I didn't get a
confirmation email when I registered, so I couldn't.

I know Shotwell supports symbolic links to directories containing images.
Are there any plans to support symbolic links to images in Shotwell? Would
patches be welcome? Any suggestions on implementation direction?

Thanks!

Ethan
oliver
2011-09-01 10:05:00 UTC
Hi,


a while ago I talked about using symbolic links.
But I was not accurate enough in picking the terms.
I later explained it: "symbolic links" was not meant
as symbolic links on the filesystem level, which is
what symbolic links are.
I meant: symbolically linking the data into the database,
but not file by file; I meant whole directory trees ("repositories"),
with the database kept inside the corresponding directory tree.


So, if you want to implement "symbolic links instead of copying"
on the filesystem level, I'm not sure if this really makes sense.
The reason is: the link only works if the pictures are available.
If not, it's a broken link.
In that case, it would be better to follow the way I mentioned:
if a whole picture directory is not there (e.g. removable media),
then it's checked only once whether the whole directory is there,
instead of checking each broken link individually.

Ciao,
Oliver
Ethan
2011-09-01 13:11:04 UTC
Post by oliver
a while ago I talked about using symbolic links.
But I was not accurate enough in picking the terms.
I later explained it: "symbolic links" was not meant
as symbolic links on the filesystem level, which is
what symbolic links are.
Let me clarify. What I want *is* support for symbolic links on the
filesystem level. I'm experimenting with using git-annex to store my
picture files, so my picture library consists of symlinks to pictures. The
symlinks have appropriate names like "dscn8517.jpg", and they link to files
like
".git/annex/objects/8x/k7/WORM-s1985206-m1306675740--dscn8517.jpg/WORM-s1985206-m1306675740--dscn8517.jpg".

I'd like these files to be recognized at all when I start shotwell. If the
links are broken, I'm OK with them being marked as "missing". In a perfect
world, if they got shuffled around, I would like them to not be re-imported
as duplicates.

Shotwell currently explicitly doesn't support symbolic links as "images"
(see BatchImport.vala:1444 and DirectoryMonitor.vala:69). I can understand
that it might be complicated to figure out how to treat them; what happens
if a symlink changes target? If a symlink is updated, does this mean it
needs to be rescanned? In my use case I don't care too much about these
questions, but recognizing the files at all would be an important step.
Advice on how to go about implementing this would therefore be appreciated.
I think I can just add cases to both of the above files for symlinks that
provide file metainformation from the linked file and add a FileMonitor on
the linked file (as well as the symlink?) but I haven't actually written any
code yet.

Ethan
oliver
2011-09-01 15:23:23 UTC
Post by Ethan
Post by oliver
a while ago I talked about using symbolic links.
But I was not accurate enough in picking the terms.
I later explained it: "symbolic links" was not meant
as symbolic links on the filesystem level, which is
what symbolic links are.
Let me clarify. What I want *is* support for symbolic links on the
filesystem level. I'm experimenting with using git-annex to store my
picture files, so my picture library consists of symlinks to pictures. The
symlinks have appropriate names like "dscn8517.jpg", and they link to files
like
".git/annex/objects/8x/k7/WORM-s1985206-m1306675740--dscn8517.jpg/WORM-s1985206-m1306675740--dscn8517.jpg".
Aha.
I didn't know of git-annex.

I used git as a repository for my files when preparing an exhibition.
The disadvantage is: the files are in the working directory as well as in
the repository, which means at least twice as much disk space is used
as I would need for the pictures... and with every new file version more space is needed.
(But I have versioning, which might become very helpful.)
After my work was done I removed the .git directory and saved disk space.
I had the original photographs elsewhere and the work for printing in the
pic-working dir.

I may also look at git-annex.

Not sure if it provides what I'm looking for,
but AFAIK git is written as libraries and interfaces to the user,
so the functionality should be available for own programs.


I tried the "-d" switch of shotwell with relative pathnames.
The db-dir was created relative to $HOME.
I hope it will also work with absolute pathnames, because then I could use
$ shotwell -d <abspath_to_picdir>
which comes close to my idea of picture repositories, which contain
the pictures as well as the database.

This would be achieved if I use $MY_PICTURE_DIR as the path to the picture files
as well as for "-d".

I hope the abspath attempt will work.
I have not tried it so far.

If it does, then the only problem is that there is no meta-view
which automatically shows me all the picture repositories in the shotwell
overview/menus.
Then I could pick out one or more of those repos, e.g. one is on USB, another is on
a removable HDD under /mnt/pics/ and the rest is on my main HDD somewhere
in $HOME or so.

At the moment one would need to start another shotwell -d <mypicpath>
instance to get access to other picture repos.
Post by Ethan
I'd like these files to be recognized at all when I start shotwell. If the
links are broken, I'm OK with them being marked as "missing". In a perfect
world, if they got shuffled around, I would like them to not be re-imported
as duplicates.
Shotwell does check files on importing.
I tried at least with some jpg-files and it works.
Don't know if it also can handle some 10k or some 100k files
efficiently.
Also I don't know how it compares files to check on equality.
But it seems, for jpeg-files it checks the pure data-part.

But if it finds files with equal data parts, it *might* be nice
to ask whether the comment parts of the other files should be added to the database,
so that pictures with differing comment sections but the same jpg data
would result in all the found comment parts being added, so that no comment is
missing.

Would be nice, but is a rather minor feature (nice to have, but not extremely important).
Post by Ethan
Shotwell currently explicitly doesn't support symbolic links as "images"
(see BatchImport.vala:1444 and DirectoryMonitor.vala:69). I can understand
that it might be complicated to figure out how to treat them;
[...]

Symlinks are not that complicated.
But it might need some more syscalls to check that.
=> man 2 stat
=> man 2 lstat
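
To illustrate the difference between those two calls, here is a minimal
sketch in Python (not Shotwell's Vala code; the helper name classify is
invented for this example). os.lstat() describes the link itself, while
os.stat() follows it, so a link whose target is gone can be told apart from
a working one:

```python
import os
import stat


def classify(path):
    """Classify a path without assuming that it can be followed."""
    try:
        info = os.lstat(path)  # like lstat(2): looks at the link itself
    except FileNotFoundError:
        return "missing"
    if stat.S_ISLNK(info.st_mode):
        try:
            os.stat(path)  # like stat(2): follows the link to its target
        except OSError:
            return "broken symlink"
        return "symlink"
    if stat.S_ISDIR(info.st_mode):
        return "directory"
    if stat.S_ISREG(info.st_mode):
        return "regular file"
    return "other"
```

The extra os.stat() call is only made for paths that turn out to be
symlinks.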



Ciao,
Oliver
oliver
2011-09-01 15:36:04 UTC
Post by Ethan
Post by oliver
a while ago I talked about using symbolic links.
But I was not accurate enough in picking the terms.
I later explained it: "symbolic links" was not meant
as symbolic links on the filesystem level, which is
what symbolic links are.
Let me clarify. What I want *is* support for symbolic links on the
filesystem level. I'm experimenting with using git-annex to store my
picture files,
[...]


I found that page:
http://git-annex.branchable.com/

Looks interesting/promising.

I think I will have a closer look at it.

Ciao,
Oliver
Jim Nelson
2011-09-01 22:41:28 UTC
When we first started Shotwell, we avoided symlinks because they open up a
number of issues we preferred not to have to face up front, such as broken
links, directory loops, and who knows what else. Over time we added some
support: the auto-import and directory monitoring features do support
symlinks for directories (for some specific use cases we felt we needed to
support), but not with files themselves.

We have a ticket to support symlinks completely:
http://redmine.yorba.org/issues/2983 I don't know that we would want to
install FileMonitors for each linked file, however, since there is a hard
limit on how many each process can create. The general strategy we've used in directory
monitoring is to install a single directory monitor and watch the files it
contains.

-- Jim
oliver
2011-09-02 10:49:07 UTC
Post by Jim Nelson
When we first started Shotwell, we avoided symlinks because they open up a
number of issues we preferred not to have to face up front, such as broken
links, directory loops, and who knows what else. Over time we added some
support: the auto-import and directory monitoring features do support
symlinks for directories (for some specific use cases we felt we needed to
support), but not with files themselves.
[...]

If you just use files without checking whether they are symbolic links,
then you already have the problems that go along with symbolic links.
Any file that you import might already be a symbolic link in the
directory you import.

How is/was that handled?
Post by Jim Nelson
http://redmine.yorba.org/issues/2983 I don't know that we would want to
install FileMonitors for each linked file,
Is Filemonitoring already used on regular files?
And: is it file-based?
If so, this might explain some of the performance issues I had with more than 100k files.
Post by Jim Nelson
however, since there is a hard
limit each process can create. The general strategy we've used in directory
monitoring is to install a single directory monitor and watch the files it
contains.
[...]

Is this automatically done, or triggered by the user?
Such things as automatisms can strongly reduce performance....
...at Desktop Environments as well as in photo-programs.

Ciao,
Oliver
Jim Nelson
2011-09-02 19:40:45 UTC
Post by oliver
If you just use files, without checking, if they are symbolic links,
then you have the problems that go along with symbolic links, already.
Any file that you import might already be a symbolic link in the
directory you import.
How is/was that handled?
We do check if the directory is a symbolic link before processing it. GLib
has a mechanism where you can determine if a symbolic link to a directory
corresponds to the "real" path to the directory. This is how we avoid
directory loops.
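
A rough sketch of that idea in Python, not the GLib code Shotwell actually
uses: follow symlinked directories, but remember the resolved ("real") path
of every directory already visited and skip any directory that resolves to
one of them. The function name walk_following_links is made up for this
example.

```python
import os


def walk_following_links(root):
    """Yield directories under root, visiting each real directory only once."""
    seen = set()    # resolved paths of directories already visited
    stack = [root]
    while stack:
        directory = stack.pop()
        real = os.path.realpath(directory)
        if real in seen:
            continue  # reached again through a symlink: this would be a loop
        seen.add(real)
        yield directory
        with os.scandir(directory) as entries:
            for entry in entries:
                # is_dir() follows symlinks by default, so linked
                # directories are queued as well
                if entry.is_dir():
                    stack.append(entry.path)
```

A symlink that points back at one of its own parent directories is queued
once and then skipped, because it resolves to a path that is already in
seen.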


Post by oliver
Is Filemonitoring already used on regular files?
And: is it file-based?
If so, this might explain some of the performance issues I had with more than 100k files.
No. We do install a FileMonitor for each directory in your library (usually
~/Pictures), but not for each file. (FileMonitor allows either to be
monitored.) There's still a scalability issue when you have an insane
number of directories in your library. But, if you have 100,000 photos in,
say, 500 directories (not implausible), then only 500 FileMonitor objects
are created, which is reasonable.
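
The arithmetic above can be checked against a real library with a small
Python sketch (the function name monitor_counts is made up; it only counts
and does not create any monitors). It compares how many monitors a
one-per-directory strategy would need with how many a one-per-file strategy
would need:

```python
import os


def monitor_counts(library_root):
    """Return (directories, files) found under library_root.

    os.walk() does not descend into symlinked directories by default,
    so only real directories are counted.
    """
    directories = 0
    files = 0
    for _dirpath, _dirnames, filenames in os.walk(library_root):
        directories += 1         # one monitor per directory
        files += len(filenames)  # one monitor per file, if done file by file
    return directories, files
```

For the example in the text, this would return something like
(500, 100000).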
Post by oliver
Post by Jim Nelson
however, since there is a hard
limit each process can create. The general strategy we've used in directory
monitoring is to install a single directory monitor and watch the files it
contains.
[...]
Is this automatically done, or triggered by the user?
Such things as automatisms can strongly reduce performance....
...at Desktop Environments as well as in photo-programs.
The FileMonitors are only installed if the "Watch library directory for new
files" option is checked in the Preferences dialog. By default this is turned off.

-- Jim
Ethan
2011-09-02 20:01:19 UTC
Post by Jim Nelson
When we first started Shotwell, we avoided symlinks because they open up a
number of issues we preferred not to have to face up front, such as broken
links, directory loops, and who knows what else. Over time we added some
support: the auto-import and directory monitoring features do support
symlinks for directories (for some specific use cases we felt we needed to
support), but not with files themselves.
I've been messing around with the attached patch which makes the simple
changes to allow Shotwell's monitoring to use symlinked photos and videos.
A similar change could be made to BatchImport.vala:1411 to allow the same
when doing "manual" imports, but I haven't tested it yet. query_for_info
will return a FileInfo that still has file_type SYMBOLIC_LINK for broken
symlinks, so those would still not get imported.

I don't think those changes would allow Shotwell to notice when a symlink
goes bad, but that's more than I need right now.
Post by Jim Nelson
http://redmine.yorba.org/issues/2983 I don't know that we would want to
install FileMonitors for each linked file, however, since there is a hard
limit each process can create. The general strategy we've used in directory
monitoring is to install a single directory monitor and watch the files it
contains.
For my purposes that's good enough. It's true that symbolic links could in
principle point outside the directory tree, and while it would be better if
we could notice changes to them, that could lead to an explosion of monitors
(at worst, one for each symlink if each symlink points to a new directory).

What would be necessary for you to apply a patch like this one? Would you
want symlink support to be utterly bulletproof? Broken symlinks should be
noticed and counted as "missing"? Monitors installed on places where
symlinks point? I'm willing to do the legwork if I know what the
requirements are.

Ethan
Jim Nelson
2011-09-06 23:30:29 UTC
Post by Ethan
What would be necessary for you to apply a patch like this one? Would you
want symlink support to be utterly bulletproof? Broken symlinks should be
noticed and counted as "missing"? Monitors installed on places where
symlinks point? I'm willing to do the legwork if I know what the
requirements are.
Utterly bulletproof is always good.

Getting a patch landed right now is difficult because we need to consider
all the use cases for symbolic links, especially the perils and pitfalls.
Monitoring each symlink might be overkill.

What I should've asked before was, why can't you simply create a symlink
from your Pictures directory to the git annex directory? That would solve
this problem right away, I think. Library monitoring should work as well.

I've attached your patch to the ticket with a link to the message, for
future reference: http://redmine.yorba.org/issues/2983

-- Jim
Ethan
2011-09-07 23:04:42 UTC
Post by Jim Nelson
What I should've asked before was, why can't you simply create a symlink
from your Pictures directory to the git annex directory? That would solve
this problem right away, I think. Library monitoring should work as well.
That's a good idea but the filenames would be long and ugly instead of the
symlink filenames :) Also I'm thinking in terms of XMP sidecars. I'd want
the sidecars next to the symlinks to be used, and don't want sidecars next
to the symlink targets to be used. But I'll accept that this might be
idiosyncratic to my use case.
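
One way to express that preference, as a Python sketch under a loud
assumption: the sidecar is named like the photo with its extension replaced
by .xmp (other tools use other conventions, and this is not Shotwell
behaviour). The lookup uses the symlink's own path and never resolves it, so
a sidecar stored next to the git-annex object is ignored. The name
sidecar_for is invented for this example.

```python
import os


def sidecar_for(photo_path):
    """Return the .xmp file next to photo_path, or None if there is none.

    photo_path is deliberately not resolved: if it is a symlink, the
    directory searched is the one holding the link, not the link target.
    """
    base, _extension = os.path.splitext(photo_path)
    candidate = base + ".xmp"  # assumed naming convention
    return candidate if os.path.isfile(candidate) else None
```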

I see two or three decisions to be made in regards to symlinked-file
support:

- Should a broken symlink count as a missing file? I propose that whatever
you do for symlinked directories, symlinked files should behave the same
way. (The patch I sent you will mark broken symlinks as missing but only on
startup.)

- Should monitors be installed on symlink targets? I think it would be best
if we could but you mentioned a system limit on the number of monitors. I
did a test on my system (Ubuntu natty+oneiric, Linux kernel 3.0.0.9) and
wrote a program that opens monitors on every directory in a directory tree.
I ran it on my home directory and it stopped at 31728 -- this is also the
number of directories found by find -type "d". How do you feel about a
patch to unconditionally install monitors on symlinked files' directories?
If that turns out to cause problems, we can keep those monitors separate and
monitor them only as resources allow.

These are the only perils and pitfalls I can think of. Do you see any I
missed? I think with these changes (and I'll hack them up if you like),
symlinks can be as robust as they can be.

Ethan
Jim Nelson
2011-09-27 21:01:33 UTC
Sorry for the late reply, been busy as anything lately and am catching up on
old email.

To be honest, the more I understand what you're trying to do the more it
sounds like a highly particular use-case and not something that would be
used by our more general audience of users. Today I'm more inclined to
support symlinked directories than files because of the headaches symlinked
files lead to, such as broken links, or multiple links to the same file
(which, ideally, Shotwell would recognize as the same file, and not merely
duplicates).
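
Recognizing several links as one file could, for instance, be done by
comparing resolved paths. A minimal Python sketch, with the made-up name
group_by_target; it handles symlinks only (hard links would need a
comparison of device and inode numbers) and says nothing about how Shotwell
would store the result:

```python
import os
from collections import defaultdict


def group_by_target(paths):
    """Map each resolved path to the list of given paths that lead to it."""
    groups = defaultdict(list)
    for path in paths:
        # realpath() resolves every symlink in the path
        groups[os.path.realpath(path)].append(path)
    return dict(groups)
```

Any entry whose list holds more than one path is a file that is reachable
under several names.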

Plus, given the scope of the problem Shotwell is addressing (managing thousands,
even tens of thousands of photo files), I don't imagine file
links being appropriate for most users.

As far as your test, I don't know if GLib will ever stop creating
FileMonitor objects (by throwing an exception). It may be that you were
able to create all those but many of them were broken, that is, not actually
monitoring anything.

Incidentally, I believe we do monitor symlinked directories, so that won't
be necessary.

-- Jim