Question About Duplicate Handling

christhegale
Posts: 9
Joined: Wed Apr 13, 2016 3:20 pm

Question About Duplicate Handling

Post by christhegale »

Good Afternoon,

Apologies if this has been asked before - I did some searching and couldn't find anything.

I've recently been running into a bit of trouble with the folder monitoring accidentally overwriting comics that I already had sorted with a different comic that has the same name but is a different version/year. For example, I have the series "Rogue" from 2004 already in my library with everything downloaded and the series set to pause. I recently downloaded another series named "Rogue", but this one was from 1985 and it ended up in the same folder that Mylar monitors for newly downloaded issues from torrents. It looks like what happened was that the folder monitor saw the newly downloaded issues for the other series, thought they were for the series I already had in my library, and, because the new files were larger in filesize, replaced some issues from the 2004 series with issues from the 1985 series.

In order to avoid this happening again as I add series into the library, I turned on the Duplicate Dump folder setting in the config, but since the setting for Retain based on is still set to filesize (there isn't an option to say "never retain") I'm not sure how this is going to behave. Now that I've set up the duplicate folder, will any duplicates found be placed in that folder for me to sort out on my own, or will Mylar still use the Retain based on... setting first (which means that this could keep happening) and then only issues that failed the retention check would end up in the duplicate folder? (I hope this makes sense)
christhegale
Posts: 9
Joined: Wed Apr 13, 2016 3:20 pm

Re: Question About Duplicate Handling

Post by christhegale »

P.S. While I'm thinking about it, I'd love a setting somewhere that I could use that basically says that once a series is set to paused it no longer factors in with post-processing anymore.

I'm not really sure what the pause/resume comic setting actually does - I've been using it mainly for organizational purposes so that once a series has Ended and I have all issues in that series I can pause it and visually see that the series is 100% good to go, but it'd be great if once a comic is paused it would no longer look or try to match downloaded comics to those paused series. That would probably alleviate some of the issues I've been running into where Mylar is thinking it's finding duplicates that are actually from different volumes for series that I've already completed.
evilhero
Site Admin
Posts: 2883
Joined: Sat Apr 20, 2013 3:43 pm

Re: Question About Duplicate Handling

Post by evilhero »

christhegale wrote: Good Afternoon,

Apologies if this has been asked before - I did some searching and couldn't find anything.

I've recently been running into a bit of trouble with the folder monitoring accidentally overwriting comics that I already had sorted with a different comic that has the same name but is a different version/year. For example, I have the series "Rogue" from 2004 already in my library with everything downloaded and the series set to pause. I recently downloaded another series named "Rogue", but this one was from 1985 and it ended up in the same folder that Mylar monitors for newly downloaded issues from torrents. It looks like what happened was that the folder monitor saw the newly downloaded issues for the other series, thought they were for the series I already had in my library, and, because the new files were larger in filesize, replaced some issues from the 2004 series with issues from the 1985 series.
Yes, I've seen this happen before with various other series, but usually it's when both series are in the same year. I guess a lot of the answers probably depend on a bit more information about the problem at hand:
- The 'Rogue (1985)' issue - did the series already exist on your watchlist, or was it just an issue downloaded outside of Mylar that happened to be in the same folder that Mylar monitors for downloads?
- If the series did exist on your watchlist, was there anything in the Version Number field within the Edit Settings tab for that particular series within Mylar?
- For the 'Rogue (2004)' series, was there anything present in the Version Number field?
- What was the filename of the 'Rogue (1985)' issue?

Mylar does do date checks against the issue that's being post-processed to try and ensure things like this don't happen - but obviously it's limited to the information it has on hand, as well as what's in the name of the file. It will check the store date of the issue and compare it to the year in the filename to ensure things match up. It will also check the Version Number within Mylar and make sure it matches what is in the filename, if one is present there (v2, v2004, etc). Now if the filename didn't have a year, and it didn't have a Version Number (it should really be Volume Number, I need to change that) - then it would most definitely assume the 1985 issue belongs in the 2004 series, since all it has to go by then is the series title itself.
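As a rough illustration of the checks described above, here is a minimal Python sketch. It is not Mylar's actual code: the function names, the regular expressions, and the exact-year comparison are all assumptions made for the example, and the real matching logic may well be more tolerant (or stricter) than this.

```python
import re

def parse_filename(name):
    """Pull an optional volume tag (v2, v2004) and a parenthesised
    4-digit year out of a filename. Hypothetical helper, not Mylar's parser."""
    volume = re.search(r'\bv(\d+)\b', name, re.IGNORECASE)
    year = re.search(r'\((\d{4})\)', name)
    return (volume.group(1) if volume else None,
            int(year.group(1)) if year else None)

def could_belong(filename, series_volume, issue_store_year):
    """Return False only when the filename carries a volume tag or year
    that contradicts the watchlist entry. When the filename has neither,
    nothing is left to compare except the title, so it is accepted."""
    file_volume, file_year = parse_filename(filename)
    if file_volume is not None and series_volume is not None:
        if file_volume != series_volume.lstrip('vV'):
            return False
    # Assumption: a simple exact-year comparison against the store date's year.
    if file_year is not None and file_year != issue_store_year:
        return False
    return True

print(could_belong("Rogue 01.cbr", "v1", 2004))            # title only: True
print(could_belong("Rogue v2 01 (2004).cbr", "v1", 2004))  # volume mismatch: False
```

The first call shows the case from the paragraph above: with no year and no volume in the filename, the series title is the only thing left to go by.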
In order to avoid this happening again as I add series into the library, I turned on the Duplicate Dump folder setting in the config, but since the setting for Retain based on is still set to filesize (there isn't an option to say "never retain") I'm not sure how this is going to behave. Now that I've set up the duplicate folder, will any duplicates found be placed in that folder for me to sort out on my own, or will Mylar still use the Retain based on... setting first (which means that this could keep happening) and then only issues that failed the retention check would end up in the duplicate folder? (I hope this makes sense)
If you have the Duplicate Dump folder set up, then any duplicates that fail the Retain check (in your case, anything that is a duplicate and is smaller in filesize than the file it's compared against) will get moved into the Dump folder. If it was set to never retain, it would delete the duplicate, which in your case would mean deleting the wrong file, which is not a good thing. Also, for it to never retain, it has to have something to compare against in order to decide whether to retain it or not (the filesize, cbr/cbz options). The 'never retain' option does exist: you just don't enable the Duplicate Dump Folder option. It will then just replace the file with the one that passed the Retain check (and in doing so delete the existing file if they're named the same, which is basically the case if you have Rename Files enabled).
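To make the order of operations concrete, here is a small Python sketch of the decision as described above, for the "retain based on filesize" case. The function name and the returned labels are made up for illustration and are not Mylar's internals. Two points are assumptions the explanation above doesn't cover: what happens on a tie (the sketch keeps the existing file), and what happens to a losing incoming file when no dump folder is set (the sketch just labels it "discard").

```python
def resolve_duplicate(existing_size, incoming_size, dump_folder_enabled):
    """Decide which file survives and what happens to the other one.

    Returns (kept, loser_action) where kept is "existing" or "incoming"
    and loser_action is "move_to_dump" or "discard".
    """
    # Step 1: the Retain check always runs first (here: larger filesize wins).
    # Assumption: on equal sizes the existing file is kept.
    kept = "incoming" if incoming_size > existing_size else "existing"

    # Step 2: only the file that failed the Retain check is affected by
    # the Duplicate Dump folder setting.
    loser_action = "move_to_dump" if dump_folder_enabled else "discard"
    return kept, loser_action

# A larger incoming file still wins with the dump folder enabled,
# but the replaced file is moved aside instead of being lost.
print(resolve_duplicate(30_000_000, 45_000_000, dump_folder_enabled=True))
```

So the dump folder doesn't stop a wrongly matched file from winning the check; it only means the file that lost can be recovered afterwards.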
evilhero
Site Admin
Posts: 2883
Joined: Sat Apr 20, 2013 3:43 pm

Re: Question About Duplicate Handling

Post by evilhero »

christhegale wrote: P.S. While I'm thinking about it, I'd love a setting somewhere that I could use that basically says that once a series is set to paused it no longer factors in with post-processing anymore.

I'm not really sure what the pause/resume comic setting actually does - I've been using it mainly for organizational purposes so that once a series has Ended and I have all issues in that series I can pause it and visually see that the series is 100% good to go, but it'd be great if once a comic is paused it would no longer look or try to match downloaded comics to those paused series. That would probably alleviate some of the issues I've been running into where Mylar is thinking it's finding duplicates that are actually from different volumes for series that I've already completed.
Heh, missed this one...

That's pretty much exactly what the Pause status is supposed to do, or what the plan was for it when I put it in place a long time ago. Mylar kind of does this already when using the pull-list: if a watchlist entry is in an Ended status, it won't try to match up to anything on the pull-list, although I don't think it actually looks at the Paused/Active status. Could really just use the Continuing / Ended status and the %completed. If it's in an 'Ended' status and it's at 100% completion, I think it'd be safe to assume that it doesn't need to get referenced during manual post-processing runs (normal post-processing via SAB/NZBget has the exact IssueID passed to it, so it will always hit the proper spot), and it really shouldn't. I think maybe this is one of those situations where both should be used - being Paused, as well as being in an Ended status with 100% completion.

Yes, I like this very much. Expect it soon ;)
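The rule proposed above (skip a series during manual post-processing only when it is Paused and Ended and at 100% completion) could be sketched like this in Python. The `Series` class, its field names, and the issue counts are hypothetical and only there to make the example runnable; they don't reflect Mylar's database schema.

```python
from dataclasses import dataclass

@dataclass
class Series:
    name: str
    year: int
    status: str   # "Continuing" or "Ended"
    paused: bool
    have: int     # issues currently in the library
    total: int    # issues in the series

def is_match_candidate(series):
    """Return False for series that should be ignored by manual
    post-processing: Paused AND Ended AND 100% complete."""
    complete = series.total > 0 and series.have >= series.total
    return not (series.paused and series.status == "Ended" and complete)

# Illustrative watchlist; the counts are invented for the example.
watchlist = [
    Series("Rogue", 2004, "Ended", paused=True, have=12, total=12),
    Series("Rogue", 2004, "Ended", paused=False, have=12, total=12),
    Series("Some Ongoing Title", 2016, "Continuing", paused=True, have=5, total=5),
]
print([is_match_candidate(s) for s in watchlist])  # [False, True, True]
```

Requiring all three conditions means a paused series that is still Continuing, or an Ended series that hasn't been paused, would still be matched against new downloads.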
christhegale
Posts: 9
Joined: Wed Apr 13, 2016 3:20 pm

Re: Question About Duplicate Handling

Post by christhegale »

- The 'Rogue (1985)' issue - did the series already exist on your watchlist, or was it just an issue downloaded outside of Mylar that happened to be in the same folder that Mylar monitors for downloads?
- If the series did exist on your watchlist, was there anything in the Version Number field within the Edit Settings tab for that particular series within Mylar?
- For the 'Rogue (2004)' series, was there anything present in the Version Number field?
- What was the filename of the 'Rogue (1985)' issue?
Turns out the series was actually from 1995, not 1985, my mistake. Been looking at far too many numbers today, but it's still the same basic issue.

- The Rogue (1995) series was not on my watchlist - it's just something I downloaded outside of Mylar that happened to end up in the same folder that Mylar monitors. I had planned to add the series in after it downloaded, but it post-processed thinking it was part of the other series before I got to it.
- N/A - the 1995 series wasn't in my watchlist yet.
- The Rogue (2004) series has a "v1" in the Version Number. This isn't something I put in, so I'm guessing it came from ComicVine when the series was imported.
- The exact filenames of the Rogue (1995) issues that were incorrectly processed were "Rogue 01 (of 4) (1994) (Digital) (Zone-Empire).cbr", "Rogue 02 (of 4) (1995) (Digital) (Zone-Empire).cbr", etc...
christhegale
Posts: 9
Joined: Wed Apr 13, 2016 3:20 pm

Re: Question About Duplicate Handling

Post by christhegale »

That's awesome. I feel like this might resolve a lot of weird little quirks I've noticed as I've been completing older series here and there. Basically the only series I keep unpaused are either series that are currently Continuing or series that have ended but that I'm still collecting issues for in order to complete them. So by skipping all the completed/paused series, processing should in theory not only go quicker, it should also be able to start finding some of the issues Mylar thinks I already have from other volumes that share the same name.