We also store fonts, as well as image subtitles (VOBsub/PGS), which take up a fair amount of space. For simplicity, these all go through the same system as text subtitles.
I just checked the stats: attachments currently consume 34GB (compressed) of space. Uncompressed, this would be 192GB. So the ratio is a bit higher than your estimate. Note that the attachments system does de-duplicate files, e.g. only ever stores one copy of a font, so there's no efficiency to gain there.
On the server side, compression provides three main benefits: savings on disk space, savings on bandwidth usage, and easier data handling. For the first, it means 2*158GB less disk consumption (*2 because I keep a single data replica; I currently don't actually have enough backups of this data, so it may be more in the future), and this figure will increase over time. I don't see bandwidth usage being significant, so I won't say much about it. As for the last point, smaller data sets make recovery and setting up a new server quicker.
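For anyone who wants to check the numbers, here is a rough sketch of that arithmetic. The function names are made up for illustration; the only inputs are the 192GB, 34GB and two-copies figures from the stats above.

```python
def disk_savings_gb(uncompressed_gb, compressed_gb, copies):
    """Disk space saved by storing compressed data, summed over every stored copy."""
    return (uncompressed_gb - compressed_gb) * copies


def compression_ratio(uncompressed_gb, compressed_gb):
    """How many times larger the data would be if stored uncompressed."""
    return uncompressed_gb / compressed_gb


# Figures quoted in the post: 192GB uncompressed, 34GB compressed,
# 2 copies (the primary plus a single replica).
saved = disk_savings_gb(192, 34, copies=2)
ratio = compression_ratio(192, 34)
print(f"saved: {saved} GB, ratio: {ratio:.2f}x")
```

With those inputs the saving works out to 316GB across both copies, at a ratio of roughly 5.6x.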
For visitors, it means faster downloads which consume less bandwidth. It's probably not much for many ASS files, but can be useful for the All Attachments link.
Whilst there are a lot of benefits for me in having files stored compressed, I don't mind serving uncompressed copies of subtitles, but I don't have anything which can do this. If you (or someone else) can write up an nginx module similar to mod_gunzip, but for .xz files, I can install it for serving decompressed subtitles.
You can also try sorting by file type to get the files together. Alternatively, if 7z.exe is available in your PATH environment, you could type something like cmd /k 7z x *.xz into the address bar to extract all .xz files.
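If 7-Zip isn't an option, the same batch extraction could be sketched in Python with the standard-library lzma module. This is only an illustration, assuming Python 3 is installed; the function name extract_all_xz is hypothetical, and it only strips the trailing .xz from each file name.

```python
import lzma
import shutil
from pathlib import Path


def extract_all_xz(folder):
    """Decompress every *.xz file in `folder`, writing each result next to its archive.

    Returns the list of paths that were written. Files whose extracted
    copy already exists are skipped rather than overwritten.
    """
    written = []
    for archive in sorted(Path(folder).glob("*.xz")):
        target = archive.with_suffix("")  # "episode01.ass.xz" -> "episode01.ass"
        if target.exists():
            continue
        # Stream the data so large attachments are not held in memory at once.
        with lzma.open(archive, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        written.append(target)
    return written
```

Calling extract_all_xz(".") from the download folder would extract everything in one go, leaving the original .xz files in place.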
As for single files, after downloading, if 7-Zip is associated with .xz files, you should just be able to click on the downloaded item, which opens it in 7-Zip, then double-click the file to open it in Aegisub. This saves you from having to navigate to a folder and right-click -> extract.
I get that this does require an extra step somewhere, and thanks for raising that, but it is useful for those on slower and/or metered connections, as it reduces the amount needed to download. I don't mind providing the option for an uncompressed download; however, I don't have any code/program which can do this. If you can (or you can get someone else to) write an nginx module similar to mod_gunzip that decompresses .xz files, I can install it to provide uncompressed subtitles.
Is it more accurate? Since it only concerns the sub file and the rest is all the same, it's like storageOverTime = (x + w)t changing into storageOverTimeNew = (x + 4w)t, where w is the sub file size, t is time, and x is the rest, like new files and stuff. But w is very small, like less than 100 KB. The extra storage being paid for is unlikely to be in the order of 100's of KBs, right?
Let's say there are 100 files a day and 365 days a year; that's 36500 files a year. So each year it's only (25*4KB)*36500 files, which is 3.65GB more a year.
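A quick sketch of that estimate, for checking. It assumes "25*4KB" means a 25KB compressed file that is 4 times larger uncompressed, and it uses decimal units (1GB = 1,000,000KB), since that is what gives the 3.65GB figure. The function name and parameters are made up for illustration.

```python
KB_PER_GB = 1_000_000  # decimal units, matching the 3.65GB figure in the post


def yearly_storage_gb(files_per_day, compressed_kb, ratio, days=365):
    """Return (compressed_gb, uncompressed_gb) for one year of new files."""
    files = files_per_day * days
    compressed_gb = files * compressed_kb / KB_PER_GB
    uncompressed_gb = compressed_gb * ratio
    return compressed_gb, uncompressed_gb


# Assumed inputs: 100 files a day, 25KB compressed each, 4x larger uncompressed.
compressed, uncompressed = yearly_storage_gb(100, 25, 4)
extra = uncompressed - compressed
print(f"compressed: {compressed:.2f} GB/year")
print(f"uncompressed: {uncompressed:.2f} GB/year")
print(f"extra from storing uncompressed: {extra:.2f} GB/year")
```

Under those assumptions, 3.65GB is the yearly total for uncompressed files; the extra over storing them compressed comes to about 2.74GB a year.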
It would be more accurate to say that someone would have to pay for 4x the storage on an ever increasing database over several years for no compelling reason.
Well if capacity is an issue, then can't be helped I guess. Uncompressed is just 4 times bigger file size though, so you would only run out of capacity 4 times faster (unless I am missing something?).
That is true. However, I am asking if it's possible to eliminate even that 1 step for increased efficiency. And batch extract is good if it was easy to select a lot of files. But when you have a big folder and the sub files are all for different series and/or groups, there is considerable space between each file to select, extract all and then open every file. That is, unless you do a sort by date modified or something right after downloading the archives, but that's an extra step as well.
Hitagi is your mother and you didn't do your homework... Hachikuji likes to walk you to school... Hanekawa is your babysitter and you're allergic to cats. Kanbaru likes to play with you on the monkey bars... Yotsugi is in your toybox... Kaiki is in charge of your college savings fund... The Fire Aunts heard you were being bullied at school...
Our children will read/watch Koyomi's child(ren)'s adventures, written by Nisio's child(ren). (?LOL) Whilst they're all in Kindergarten (or grade schoolers...)
Not cheap, but have to extract every file which is 1 extra step. Like I download it, and it shows in download bar in Chrome or something but then can't open the sub file in Aegisub directly. Have to go to the folder and right-click to extract first, then open the extracted file.
admin recently made a comment about Direct files. Many of them will not be available until further notice. (Probably somewhere between weeks and never.)
Sorry if I wasn't clear. I was trying to address a potential concern regarding the size by suggesting that the files just be left not embedded into the MKV, e.g. put in the Extras folder or similar.
I'm not saying that you should do this. I don't actually think the added size is a concern, it was just merely an idea.
Thank you so much for really going out of your way to be as accommodating as possible. I think the effort put in clearly shows, not to mention the amount of information you've given is nothing but phenomenal. I don't think I'd ever go to such lengths to do what you've done.
Just make sure for nyaa, you have to use http://nyaa.tracker.wf:7777/announce as the main tracker.
Comment in Feedback 11/08/2017 20:29 — Anonymous: "YukinoAi"
The alternative to providing them is not to OCR them, it's to not provide them at all.
1-17 are softsubbed by Aniki-Team and are provided in ASS format. 18-25 have no softsubs available and the only copy I could obtain was a hardsubbed version by Hakuhodzo no Sensei.
Most Log Horizon releases do not include any foreign scripts at all, however I have gone out of my way to include them in an image-based softsub format (.sup) because I figured Czech viewers would appreciate having some subtitles for 18-25, even if they can be distracting at times, over no subtitles at all, or subtitles only for 1-17. OCR was not a viable option for reasons described above.
They are fine sometimes, distracting at other times. If you would like to make them better, I will happily replace the image-based softsubs with accurate text-based softsubs if you would care to provide them. Otherwise, I am done with this project and intend to move on to Season 2. Incidentally, Season 2 has softsubs for all 25 episodes in Czech, but German only has hardsubs available.
Unrelated: Thanks for providing such a wonderful service!
Comment in Feedback 11/08/2017 16:58 * — Tarte_aux_quetsches
Sure! It may take some time since I'm quite busy at the moment, but I'll do it in the coming days, you can count on me.
Also, this'll be my first time uploading a torrent, so please forgive me if I make some mistakes (I can be such a noob sometimes -_-').
@anyNOmice: Don't worry, I don't mind if it's avi or mkv. I'm not even picky about video quality (I'm currently watching Candy Candy with the old VHS rip and it doesn't bother me at all), or if it's softsubs or hardsubs. I just avoid HEVC since it's a bit too stressful for my computer. Anyway, thank you very much!
It's a DVDrip, so it really doesn't make a difference whether the container is avi or mkv. I wrote this because so many picky users have appeared on AT recently.
13/08/2017 19:12 — RamenSub