(STORM-550) LLDir::getNextFileInDir fails for some complex wildcard combinations
Review Request #32 - Created Dec. 17, 2010 and submitted
Seth ProductEngine | Reviewers | ||
viewer | |||
STORM-477 | |||
None | viewer-development |
- Re-implemented LLDir::getNextFileInDir() as an iterator object. - Added a class implementing directory entries iteration with pattern matching which is used in unit tests instead of LLDir::getNextFileInDir. - Fixed LLDir unit test which failed for some complex wildcard combinations. - Replaced all existing usages of LLDir::getNextFileInDir() with the new directory iterator object. - Removed platform specific LLDir::getNextFileInDir() implementation.
Posted (Dec. 17, 2010, 1:21 p.m.)
-
indra/llvfs/lldiriterator.h (Diff revision 1) -
This documentation is pretty good; it would be better if it were done in a way that was more doxygen compatible (putting the documentation of the methods on the methods). The class should have a definition of the glob syntax and matching rules.
-
indra/llvfs/lldiriterator.cpp (Diff revision 1) -
The translation from glob to regex does not allow for escaped glob characters. Suppose I wanted to find a set of file names that contained a '?' character - the glob would need to escape that character as in "*\?*". I don't know if we actually care passionately about that case, but...
-
indra/llvfs/lldiriterator.cpp (Diff revision 1) -
This translation of the '*' glob character may be too general. Most globbing will not match a leading '.' character in the file name (so the glob "*foo" will match the files "afoo" and "bfoo", but not ".foo"). This has the nice side effect of preventing any glob from matching the '.' and '..' directory links in unix-like file systems (which the iterator user should not have to deal with).
-
indra/llvfs/lldiriterator.cpp (Diff revision 1) -
What happens here if braces are not matched? I believe what you'll get is an invalid regexp that will fail later. Probably better to catch it here where a reasonable diagnostic can be produced. If braces != 0 at the bottom (or if it becomes < 0 the '}' case), then the input glob expression was invalid.
-
indra/llvfs/tests/lldir_test.cpp (Diff revision 1) -
If we're going to support the braces matching, test cases should be added for them.
Review request changed
Updated (Dec. 20, 2010, 1:47 p.m.)
-
- added Diff r2
- Added supported glob wildcards matching description. - Fixed leading '*' translation to regex to avoid matching leading '.' in file names. - Fixed '?' escaping. - Added incorrect braces usage handling. Added related test cases. - Fixed character ranges complementation with '!'. Added related test cases.
Posted (Dec. 23, 2010, 7:21 a.m.)
As noted below, it would be more efficient to do the translation from the glob expression just once in the constructor rather than on each call to the next method. The constructor should then throw an exception for an invalid glob expression.
-
indra/llvfs/lldiriterator.h (Diff revision 2) -
Why not use '^' for the negation qualifier here? It's more familiar (at least to unix users)
-
indra/llvfs/lldiriterator.cpp (Diff revision 2) -
It would be better to make the boost::regex a member variable - rather than translate the mask to a regex on every call to next, translate it once in the constructor.
-
indra/llvfs/lldiriterator.cpp (Diff revision 2) -
There should be a unit test for this translation.
-
indra/llvfs/lldiriterator.cpp (Diff revision 2) -
This method of tracking escapes and brackets isn't quite correct, I think... consider translating a string input containing an escaped backslash or an escaped bracket.
Review request changed
Updated (Jan. 4, 2011, 12:38 p.m.)
-
- added Diff r3
- Added a boost::regex member variable not to translate the mask to a regex on every call to next(). - Moved iterator initialization and exceptions handling to the constructor.
Posted (Jan. 5, 2011, 11:34 a.m.)
Just one policy question with this... This implementation uses llwarns for various errors in the glob expression. Given that I expect that the glob expression will normally (always?) be hard coded (I hope no one will accept a pattern as input from the user and pass it on unverified), should these use llerrs so that the error cannot be missed (it crashes)? Other than that, this looks good now.
Review request changed
Updated (Jan. 11, 2011, 8:27 a.m.)
-
- added Diff r4
-
Fixed LLDir unit test which failed for some complex wildcard combinations. Added a class implementing directory entries iteration with pattern matching which is used in unit tests instead of LLDir::getNextFileInDir. This code has been run on Linux only. It should be tested under other platforms and more test cases should be provided. For example changing directory contents while iterating through it.
- Re-implemented LLDir::getNextFileInDir() as an iterator object. - Added a class implementing directory entries iteration with pattern matching which is used in unit tests instead of LLDir::getNextFileInDir. - Fixed LLDir unit test which failed for some complex wildcard combinations. - Replaced all existing usages of LLDir::getNextFileInDir() with the new directory iterator object. - Removed platform specific LLDir::getNextFileInDir() implementation.
- Changed llwarns with llerrs in LLDirIterator and removed 2 unit tests that started to cause llerrs. - Replaced all existing usages of LLDir::getNextFileInDir() with the new directory iterator object. - Removed platform specific LLDir::getNextFileInDir() implementation.
Other reviews