btw the hoarder project is an active victim of a patent troll[0][1]; the official Firefox extension is currently blocked by dmca[2]. any donations might be helpful.
Set this up a couple weeks using an proxmox lxc script and have it using ollama to create tags. I hadn’t heard of singlefile before. That seems like an excellent pairing.
In my experience, LXC uses much fewer resources than VMs so I typically prefer LXC over VM. But in all honesty, I just use whichever is available at https://community-scripts.github.io/ProxmoxVE/
Big thing know is that depending on your gpu, a vm will want to reserve it, making it unavailable to your lxc’s. But lxc’s can share a gpu. There might be some setup you can do with certain cards to create a vgpu to allow vim share, but that’s a headache I didn’t want to go down after getting my nvidia drivers setup on host and shared to lxc. Use the tools that regular jack posted and /r/selfhosted and r/proxmox are good resources. ChatGPT is pretty well versed on this stuff as well.
Talking about hoarding, LTO tapes are the king of cheap storage, but if you want to archive significant amounts (hundreds of TB or more), it takes a significant investment to buy a tape library with somewhat recent drive. Too bad there aren't any alternatives - or are there?
Yeah, that's why i wrote that you need a tape library so you change 8 tapes at a time.
If you have LTO-7, writing 8*6TB = 48 TB before having to change tapes sounds pretty good.
And as I found out the drives are tempermental. I had a tape library and eventually both drives said they'd needed cleaning, even after cleaning. When it worked it was great, though a cheap NAS with a couple of hard drives in it replaced it and was far more reliable and cheaper.
How long do they last, and what will you do when they stop making tapes and equipment to read them?
I ask because I came from a generation with a lot of tapes (reels, cassettes, 8-track, Betamax, VHS, etc.). Cassettes are coming back a little, but not much. I know long-term storage still uses tapes, but I wonder for how long. What happens when we run out of the resources to make them? Is there no better and safer long-term media that is affordable? A magnetic event could wipe them all.
I think you're good for 20 years or so if you store the tapes well. Pretty much all of the industry is using LTO tapes so i don't see them going away soon.
Didn't realize Hoarder now supports SingleFile extension. amazing.
Regarding Hoarder - by selfhosting Hoarder , I was able to cancel my $40/year subscription to Pocket. With the money saved - I added $10 of OpenAI's API credits and use gpt-4o-mini for tagging. I don't have a powerful enough GPU to selfhost Ollama on my NAS where I'm hosting Hoarder. But gpt-4o-mini is dirt cheap for these type of use cases.
Worth noting that Linkding (what the author migrated from to Hoarder) also now supports page archiving via headless Chrome + SingleFile and also via manual upload: https://linkding.link/archiving/
That's what single file is for. Hoarder fetches the webpage using it's own browser, single file makes a copy using your browser including any sessions, then sends that to hoarder.
btw the hoarder project is an active victim of a patent troll[0][1]; the official Firefox extension is currently blocked by dmca[2]. any donations might be helpful.
[0]: https://github.com/hoarder-app/hoarder/commit/b2c795ccb562c0...
[1]: https://www.reddit.com/r/selfhosted/s/CMCPP7cc8i
[2]: https://github.com/hoarder-app/hoarder/issues/899
Set this up a couple weeks using an proxmox lxc script and have it using ollama to create tags. I hadn’t heard of singlefile before. That seems like an excellent pairing.
I recently got started with Proxmox too. Any thoughts/recommendations on running such tools in lxc, vs in a proxmox VM that has docker?
In my experience, LXC uses much fewer resources than VMs so I typically prefer LXC over VM. But in all honesty, I just use whichever is available at https://community-scripts.github.io/ProxmoxVE/
Big thing know is that depending on your gpu, a vm will want to reserve it, making it unavailable to your lxc’s. But lxc’s can share a gpu. There might be some setup you can do with certain cards to create a vgpu to allow vim share, but that’s a headache I didn’t want to go down after getting my nvidia drivers setup on host and shared to lxc. Use the tools that regular jack posted and /r/selfhosted and r/proxmox are good resources. ChatGPT is pretty well versed on this stuff as well.
Thoughts on this vs something like ArchiveBox?
No really the same goal. In Hoarder, the goal is to tag and make content easily searchable. The cached part is a plus, not the main goal.
Actually, it's good but not an cached archive, its a just a cached zen mode version of the webpage (or full file if it is a PDF, EPUB, ...).
Talking about hoarding, LTO tapes are the king of cheap storage, but if you want to archive significant amounts (hundreds of TB or more), it takes a significant investment to buy a tape library with somewhat recent drive. Too bad there aren't any alternatives - or are there?
Tapes are really crap for home use though. They're expensive, super noisy. You constantly have to change them during backing up.
What I do now is use a whole box full of older harddrives that I replaced in my NAS. And I basically use them as tapes with a change frame.
Yeah, that's why i wrote that you need a tape library so you change 8 tapes at a time. If you have LTO-7, writing 8*6TB = 48 TB before having to change tapes sounds pretty good.
Hm yeah but those tapes, they're not really a lot cheaper than a HDD of that capacity. And a tape library is a very expensive, huge and noisy.
And as I found out the drives are tempermental. I had a tape library and eventually both drives said they'd needed cleaning, even after cleaning. When it worked it was great, though a cheap NAS with a couple of hard drives in it replaced it and was far more reliable and cheaper.
Here i'm seeing the cheapest HDD at a cost of 15€/TB and LTO-9 tape at 4.72€/TB. That's more than a 3x difference.
How long do they last, and what will you do when they stop making tapes and equipment to read them?
I ask because I came from a generation with a lot of tapes (reels, cassettes, 8-track, Betamax, VHS, etc.). Cassettes are coming back a little, but not much. I know long-term storage still uses tapes, but I wonder for how long. What happens when we run out of the resources to make them? Is there no better and safer long-term media that is affordable? A magnetic event could wipe them all.
Tapes are still being actively developed for archiving by companies like Fuji, Sony and IBM. They’re not going away any time soon.
And if a magnetic event is strong enough to wipe all your tapes you probably have bigger problems on your hands than a fried backup.
I think you're good for 20 years or so if you store the tapes well. Pretty much all of the industry is using LTO tapes so i don't see them going away soon.
Didn't realize Hoarder now supports SingleFile extension. amazing.
Regarding Hoarder - by selfhosting Hoarder , I was able to cancel my $40/year subscription to Pocket. With the money saved - I added $10 of OpenAI's API credits and use gpt-4o-mini for tagging. I don't have a powerful enough GPU to selfhost Ollama on my NAS where I'm hosting Hoarder. But gpt-4o-mini is dirt cheap for these type of use cases.
Worth noting that Linkding (what the author migrated from to Hoarder) also now supports page archiving via headless Chrome + SingleFile and also via manual upload: https://linkding.link/archiving/
Can Hoarder archive a webpage protected by some kind of auth / login?
That's what single file is for. Hoarder fetches the webpage using it's own browser, single file makes a copy using your browser including any sessions, then sends that to hoarder.
sounds promising! thanks, i’ll look into this.