Search order

leaderdog
Posts: 377
Joined: Sun Apr 26, 2015 1:52 pm

Search order

Post by leaderdog »

Hi Evilhero,

I finally got Mylar set up on a server PC. I started with openmediavault, and while it is amazing for file sharing, my gross lack of Debian/Linux knowledge meant I had to reformat the PC 3 times because I managed to mess up the OS when trying to install programs. Blindguy on here did help me get Mylar installed; he was the only person that helped on the OMV forum (the admins/mods there seemed to just ignore my question), so thanks again blindguy for helping. But Mylar still wasn't searching correctly, and the non-stop issues with "permissions" were driving me insane. I gave up on that, and I now have a Windows 7 box that does everything OMV did and the programs actually work. So Mylar now runs 24 hours a day. So far it has found a few issues from my wanted list, but I'm not sure how it will go with the new releases. It is a fresh install and the only things I copied over were the mylar.db and the cache. You had posted in a different thread how to "update/force Mylar to read the new location for the files" and that worked beautifully.

I decided to turn on NZB search at startup. I figured maybe it would encourage Mylar to search for the new releases. But this is what happens:
1) Mylar began its search from the very top of the wanted list. -- I was rather hoping it would start from the bottom. I have 300+ issues on the list, many of which haven't been scanned/released or were never posted. So that's a lot of issues to run through, at 1 minute plus scan time each, before it gets anywhere near the bottom.

2) I changed the log size because it wasn't logging enough and restarted Mylar. -- Mylar began the NZB search again, but it started right back at the top rather than continuing from where it left off.

Two suggestions that I think might make this work a bit better. If either is already implemented, then I have something wrong with my Mylar setup.

1) NZB search should remember/log the last searched file in the order of the wanted list, and continue from there.

2) 'This week' releases should be on a separate wanted list that overrides the long-term wanted list, so that Mylar checks for those first. Chances are that, like mine, the big wanted list is mostly issues that haven't been scanned/released, so constantly checking those before the newer content is sort of working backwards. So essentially, for the week of the release, the "this week" issues should always be searched first, and only then the long list of unlikely finds. Especially on restart: a restart would search the new releases, then resume from the last known searched issue in the long wanted list.
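To make suggestion 2 concrete, here is a rough sketch of the ordering I have in mind. It is only an illustration: the (title, release date) tuples, the `search_order` name and the 7-day window are my own assumptions, not how Mylar stores or searches anything.

```python
from datetime import date, timedelta


def search_order(wanted, today, window_days=7):
    """Return wanted issues with this week's releases first, then the backlog.

    `wanted` is a list of (title, release_date) tuples. The tuple shape and
    the 7-day window are assumptions for illustration, not Mylar's schema.
    """
    cutoff = today - timedelta(days=window_days)
    this_week = [w for w in wanted if w[1] >= cutoff]
    backlog = [w for w in wanted if w[1] < cutoff]
    # Newest first inside each tier, so fresh releases are always tried first.
    this_week.sort(key=lambda w: w[1], reverse=True)
    backlog.sort(key=lambda w: w[1], reverse=True)
    return this_week + backlog


wanted = [
    ("OLD SERIES #12", date(2014, 3, 5)),
    ("HARLEY QUINN #20", date(2015, 9, 16)),
    ("OLDER SERIES #3", date(2012, 1, 11)),
    ("WONDER WOMAN #44", date(2015, 9, 16)),
]
ordered = search_order(wanted, today=date(2015, 9, 18))
for title, released in ordered:
    print(released, title)
```

With a list like this, the two 2015-09-16 issues get searched before anything in the long-term backlog, no matter where they sit in the wanted list.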

So right now I'm not sure if Mylar is going to download new files or not. I'll let it go for the day. It didn't find anything overnight, even though the releases started showing up on usenet-crawler last evening, but I don't think it had made it through my huge wanted list before it started over. It looks like it will take a minimum of 5 hours to make it through the list, and I'm not sure how long the search time is, so add that time on top of those 5 hours.

Here's another question: I'm not sure how this works, and I'm wondering if it has something to do with why Mylar never searches for any new releases. In the clean-newreleases.txt, every title has None Skipped after it:


2015-09-16	DC COMICS	4	BIZARRO	None	Skipped
2015-09-16	DC COMICS	4	BLACK CANARY	None	Skipped
2015-09-16	DC COMICS	4	CONSTANTINE THE HELLBLAZER	None	Skipped
2015-09-16	DC COMICS	4	DOCTOR FATE	None	Skipped
2015-09-16	DC COMICS	4	DOOMED	None	Skipped
2015-09-16	DC COMICS	TP	FLASH SEASON ZERO	None	Skipped
2015-09-16	DC COMICS	4	GREEN LANTERN: THE LOST ARMY,	None	Skipped
2015-09-16	DC COMICS	20	HARLEY QUINN	None	Skipped
2015-09-16	DC COMICS	10	INJUSTICE GODS AMONG US YEAR FOUR	None	Skipped
2015-09-16	DC COMICS	4	MARTIAN MANHUNTER	None	Skipped
2015-09-16	DC COMICS	TP	NEW TEEN TITANS	VOL 03	Skipped
2015-09-16	DC COMICS	4	PREZ	None	Skipped
2015-09-16	DC COMICS	4	ROBIN SON OF BATMAN	None	Skipped
2015-09-16	DC COMICS	6	SECRET SIX	None	Skipped
2015-09-16	DC COMICS	14	SENSATION COMICS FEATURING WONDER WOMAN	None	Skipped
2015-09-16	DC COMICS	21	SUPERMAN WONDER WOMAN	None	Skipped
2015-09-16	DC COMICS	HC	SWAMP THING BY SCOTT SNYDER DLX ED	None	Skipped
2015-09-16	DC COMICS	44	WONDER WOMAN	None	Skipped
All the non TP and HC are set as wanted in my list, but these say skipped. Not sure if that has anything to do with anything, but thought I'd ask ;)

That file was from last week. I had removed this and the newreleases.txt from the cache folder because I thought they might interfere, and Mylar rewrote the two files a couple of days later. The newreleases.txt is repopulated, but the clean-newreleases.txt has no content inside; it's 0 KB.

Oh, I did turn off RSS. I kept checking back and there did not seem to be any comic releases on the RSS. If the RSS on the site was broken down into categories so they didn't all run together, then it would be worthwhile I think, but with movies, TV, books, comics, games, dirty content, etc. all in one list, it just fills the list too fast to be useful.

Ugh, I hope that was semi-structured so it was easy enough to follow. ;)

Thanks for all your work! If Mylar already does what I'm suggesting please explain to me how, since mine is definitely not working like this.
evilhero
Site Admin
Posts: 2887
Joined: Sat Apr 20, 2013 3:43 pm

Re: Search order

Post by evilhero »

*Phew* that's a lot to think about early in the mornin' eh ;) I'll try and answer your questions in a break-down/answer format so it'll be easier for us both to follow:
leaderdog wrote:
I decided to turn on NZB search at startup. I figured maybe it would encourage Mylar to search for the new releases. But this is what happens:
1) Mylar began its search from the very top of the wanted list. -- I was rather hoping it would start from the bottom. I have over 300+ issues that are on the list. Many that aren't scanned/released or never posted. So that's a lot of issues to run through at 1 minute plus scan time before it gets anywhere near the bottom.

Yeah, that's probably right, although what it's supposed to do is start from the most recent releases and work its way from most recent to least recent. How you have your Wanted list sorted (if you manually re-sorted it by clicking on a column header, for example) doesn't factor into the order Mylar searches in. This is actually very high on my to-do list for this exact reason. Having 100+ issues you're searching for will take forever (and subsequently cause some other overlapping problems, namely downloading some issues twice depending on the size of your list). My goal is to have it use only the RSS feeds for routine searching, and if you click on a Force Search, or search manually (magnifying glass icon), it would do a full rss + api backlog search. That's pretty much what Sonarr does: it only monitors RSS feeds on a continual basis, with the logic that if an api search has already been done, there's no need to keep checking the api for more results while the rss feed is being monitored, since you'd be doing the exact same search with the exact same results. I'm hoping to get this into the development branch within the next week or so.
leaderdog wrote:
2) I changed the log size because it wasn't logging enough and restarted mylar. -- Mylar began the nzb search again, but it started right back at the top rather than continuing from where it left off.
1) NZB search should remember/log the last searched file in the order of the wanted list, and continue from there.

Yes, there is a lot of information that gets logged into those log files, a lot of which can be trimmed down, but until the logging gets cleaned up it's a necessary evil (better to have too much logging than not enough). I have my log files set to 5 MB, which seems to be a decent tradeoff. But again, if you can't afford to spare 25 MB - 50 MB (5 files * 5 MB or 5 files * 10 MB) for some log files, you probably shouldn't be downloading things ;)
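For the disk-space math, here is a generic sketch using Python's standard `RotatingFileHandler`. It is not Mylar's actual logger setup: the constant names, the log path and the reading of "5 files" as the total kept on disk are assumptions for illustration.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

MAX_SIZE = 5 * 1024 * 1024   # 5 MB per log file
MAX_FILES = 5                # total files kept on disk (assumed meaning of "5 files")

# Worst-case disk usage is simply size per file times number of files.
worst_case_mb = MAX_SIZE * MAX_FILES / (1024 * 1024)

log_path = os.path.join(tempfile.mkdtemp(), "example.log")  # hypothetical path
# backupCount counts rotated copies only, so total files = backupCount + 1.
handler = RotatingFileHandler(log_path, maxBytes=MAX_SIZE, backupCount=MAX_FILES - 1)

logger = logging.getLogger("rotation_example")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.info("log rotation configured")

# Detach and close the handler so the file is flushed and released.
logger.removeHandler(handler)
handler.close()

print("worst case: %.0f MB" % worst_case_mb)
```

Doubling `MAX_SIZE` to 10 MB gives the 50 MB upper figure.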

The problem with Mylar remembering the last item searched for is that the Wanted list doesn't remain static. It changes as items get added and removed over time. Say item E is the last item that was searched for (in a hypothetical list of A-Z), so the next search would start at F. But then 3 new items get added at the top, so everything gets shuffled down by 3 and F, the starting point, now sits at position I. The search then 'misses' the 3 newest adds, which now occupy positions A-C (since the list goes from most recent to least). On top of that, Mylar doesn't have any record of the last time a search was done for a given issue. That would require a new field in the Issues table (not a big deal), but the underlying logistics of the above workflow would be a bit of work. Nothing impossible, it would just take a bit of time.
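The drift is easier to see with a toy list. This is only a sketch of the problem, not Mylar code; `resume_after` and the placeholder labels are made up for the example.

```python
def resume_after(wanted, last_searched):
    """Return the items a 'continue after the last searched item' pass would cover."""
    start = wanted.index(last_searched) + 1
    return wanted[start:]


# Wanted list ordered most recent first; the labels are placeholders.
wanted = ["A", "B", "C", "D", "E", "F", "G"]
last_searched = "E"

# Three newer issues get added before the next run and land at the top.
wanted = ["new1", "new2", "new3"] + wanted

covered = resume_after(wanted, last_searched)
missed = [item for item in wanted if item.startswith("new") and item not in covered]

print("resumes at position", wanted.index("F") + 1)  # F is now 9th, i.e. slot "I"
print("covered:", covered)
print("missed:", missed)
```

The resumed pass picks up at F as intended, but the three newest items never get looked at until the whole list wraps around.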
leaderdog wrote:
Oh I did turn off rss, I kept checking back and it did not seem there were any comic releases on the rss. If the rss on the site was set up in a way where it was broken down in categories so they didn't all run together then it would be worth while I think, but having movies, tv, books, comics, games, dirty content, etc all in one list it just fills the list too fast to be useful.
This.

Don't turn off RSS.

That's your saving grace. It will speed things up considerably and get you past the auto-searching every 6 hours, which wastes countless API hits that are totally unnecessary. Mylar automatically checks only the 'comics' rss feed and doesn't pay attention to any other feed. Almost every newznab-based site follows the same general formatting for rss (save for a few, like nzb.su, dog, and omgwtfnzbs), so you don't need to specify anything other than your api key and your UID (the 'xxxxx' in the 'i=xxxxx' in the rss feed for your provider). Mylar handles the rest.

Any new releases will hit the RSS feed, and since it gets monitored every 20 minutes by default, you're not gonna miss anything new (unless you turn off the machine, obviously), so your weekly pull list will get monitored much more efficiently. RSS monitoring doesn't cost you any API hits, and at every 20 minutes it has no impact on your provider (setting it lower might get you into some trouble; it depends on the provider, but 20 minutes is usually the unspoken interval for most users, regardless of whether it's Mylar or something else). If you did miss the portion of the feed that had some issues on it, you'd have to wait for the 6 hour search to start up, or manually initiate a Force Search.

Plus, as an added bonus, Mylar caches all the rss entries it retrieves. So the longer you have Mylar running with RSS in use, the bigger the backlog it can pull from without actually doing any api hits, thereby increasing the speed of searches substantially and lowering your usage of your provider's api. When Mylar searches, it does so in the following order: current rss feed, db rss feed (cache), api.
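In rough terms that order is a fall-through lookup, something like the sketch below. This is a simplified illustration and not the actual search code: the `find_issue` helper, the dictionaries standing in for the feed and cache, and the result ids are all invented for the example.

```python
def find_issue(query, sources):
    """Try each source in order and stop at the first one that has a result."""
    for name, lookup in sources:
        result = lookup(query)
        if result is not None:
            return name, result
    return None, None


# Stand-ins for the three tiers; real lookups would hit the feed, the db and the provider.
current_rss = {"HARLEY QUINN 20": "nzb-101"}
cached_rss = {"SECRET SIX 6": "nzb-057"}
api_calls = []  # records every query that actually costs an api hit


def api_lookup(query):
    api_calls.append(query)
    return {"PREZ 4": "nzb-009"}.get(query)


sources = [
    ("current rss feed", current_rss.get),
    ("db rss feed (cache)", cached_rss.get),
    ("api", api_lookup),
]

cached_hit = find_issue("SECRET SIX 6", sources)
api_hits_after_cache = list(api_calls)  # still empty: the cache answered first
api_hit = find_issue("PREZ 4", sources)

print(cached_hit, api_hits_after_cache)
print(api_hit, api_calls)
```

Anything the rss feed or the cache can answer never reaches the api, which is where the savings come from.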
leaderdog wrote:
All the non TP and HC are set as wanted in my list, but these say skipped. Not sure if that has anything to do with anything, but thought I'd ask

That's normal behaviour. Mylar assigns a default status of 'Skipped' when it's initially pulling down the list. It then loads the list (clean-newreleases.txt) and changes the statuses accordingly as they are matched up to your watchlist. Once it's done, it wipes the clean-newreleases.txt contents so that the file can get written to again (it does a sequential write to the txt file, so if it wasn't wiped it would have previous weeks within and would get confusing). The newreleases.txt is the actual file that's pulled down intact (or, if you have alt_pull enabled in the config.ini, Mylar parses the webpage instead of getting the file from the new releases site; usually this means that the pull-list will show new releases a few days early, like on Saturday (alt_pull=1), vs. the typical Monday (alt_pull=0)).
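As a rough picture of that flow, the sketch below parses a few of the tab-separated lines posted above and flips the default status for titles on a watchlist. The column names, the 'Wanted' status label and both helper functions are assumptions for illustration; the real matching is more involved than a plain title comparison.

```python
# Three lines copied from the listing above (tab-separated, six columns).
SAMPLE = (
    "2015-09-16\tDC COMICS\t4\tBIZARRO\tNone\tSkipped\n"
    "2015-09-16\tDC COMICS\tTP\tFLASH SEASON ZERO\tNone\tSkipped\n"
    "2015-09-16\tDC COMICS\t44\tWONDER WOMAN\tNone\tSkipped\n"
)


def load_pull_list(text):
    """Split each line into named fields (the field names are assumed)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        shipdate, publisher, issue, title, extra, status = line.split("\t")
        rows.append({
            "shipdate": shipdate,
            "publisher": publisher,
            "issue": issue,
            "title": title,
            "extra": extra,
            "status": status,
        })
    return rows


def apply_watchlist(rows, watchlist):
    """Every row starts as 'Skipped'; rows matching the watchlist get updated."""
    for row in rows:
        if row["title"] in watchlist:
            row["status"] = "Wanted"  # assumed status label
    return rows


rows = apply_watchlist(load_pull_list(SAMPLE), watchlist={"BIZARRO", "WONDER WOMAN"})
for row in rows:
    print(row["title"], row["status"])
```

So seeing 'Skipped' on every line of the raw file just means you are looking at the list before the matching step has run.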

Hope that explains things a bit better. Basically, make sure your Usenet-Crawler information in Mylar has the correct UID. Then save, and restart Mylar. The RSS will get searched on startup if it's been > 20 minutes since the last rss request (otherwise you can use the Force RSS option on the Search Providers tab within Mylar to force an rss check). You should notice things start downloading more frequently thereafter since the rss is monitored :)
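The startup check boils down to a simple time comparison, roughly like this sketch. The function and constant names are made up for the example, and treating "never requested" as due is my assumption.

```python
from datetime import datetime, timedelta

RSS_INTERVAL = timedelta(minutes=20)  # the default interval mentioned above


def rss_check_due(last_request, now, interval=RSS_INTERVAL):
    """Return True when more than `interval` has passed since the last rss request."""
    if last_request is None:
        # No recorded request yet, so treat the check as due (assumption).
        return True
    return now - last_request > interval


now = datetime(2015, 9, 18, 9, 0)
print(rss_check_due(datetime(2015, 9, 18, 8, 35), now))  # 25 minutes ago
print(rss_check_due(datetime(2015, 9, 18, 8, 55), now))  # 5 minutes ago
```

If the check is not due yet, the Force RSS option is the way to trigger one by hand.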

If you have any other questions, or want me to explain more - just ask. Aside from watching kids shows and drinking some coffee, I usually have the time to respond.
leaderdog
Posts: 377
Joined: Sun Apr 26, 2015 1:52 pm

Re: Search order

Post by leaderdog »

Hi Evilhero,

As always, thanks for the quick response.

Well Shut my mouth! I turned rss back on and it is now downloading!!!

I always had rss on, and it was set correctly but it didn't want to work. It even showed rss info in the logs, but nothing would download. I turned rss back on, restarted Mylar and away she went downloading the new releases.

Yay and thanks! ;)

Right now my log file in the mylar folder isn't populating. It has shown no updates for the past hour, while the in-program log in Mylar is showing info. I'll have to figure that one out later; I have to run for now. But it's now downloading. So I'm happy. haha

Thanks again :)
leaderdog
Posts: 377
Joined: Sun Apr 26, 2015 1:52 pm

Re: Search order

Post by leaderdog »

Hi Evilhero

It appears to be downloading correctly now. So I'm guessing something funny happened with the rss where it didn't save correctly. After I turned it off -> saved -> restarted, then turned it back on -> saved -> restarted, it worked the way it was intended.

Also, it did download everything twice, or tried to. I caught it and removed them, then had to manually go in and change them to downloaded. The only way I could find to do that was to go to the wanted page, select each one that was snatched, and change it to downloaded. Even when I went into each title's page and hit recheck, most of the time it would still return it as snatched, even though the file was clearly in the folder.

I read this thread here https://github.com/evilhero/mylar/issues/1134 and it's the same issue I had above. As I mentioned earlier, I'm curious if it's possible to have two download priorities: the primary, which is the new issues, and the secondary, which is the wanted list. That way Mylar would primarily look for the new files to download and stay out of the larger wanted list until the week passes; then whatever was left that didn't get downloaded would move over to the secondary downloads.

Not sure if that is tough to do, but for me that seems like the easiest way to keep them separated.

Also, off topic: any chance the directory shown on the comic page could open up that folder? I guess that might be specific to Windows... or I guess any Linux with a GUI, but it would make it a little quicker to get to the folder to verify if the files are in there or not. Plus add or remove files as needed too.

Thanks again,
:)