Service for letter/PDF archival

pimeys@lemmy.nauk.io · 11 months ago

Service for letter/PDF archival

zzzz@lemmy.world · 11 months ago

https://docs.paperless-ngx.com/

Nomad64@lemmy.world · 11 months ago

I second Paperless NGX. I have been using it for a few years, and it has been working great!

angelsomething@lemmy.one · 11 months ago

This is the correct and only answer :)

tofubl@discuss.tchncs.de · 11 months ago

Paperless-NGX

This ticks all your boxes. It’s really good.

tofubl@discuss.tchncs.de · edit-2 11 months ago

The killer feature for me is my networked scanner scanning directly to the paperless consume samba share and the documents just popping up in the inbox fully OCRd and pre-categorised. Pretty magical.

NB, the docs make it sound like a proper DB is optional, but it’s really not. Performance was iffy for me with sqlite but is rock solid with Postgres.

pimeys@lemmy.nauk.io · 11 months ago

This was it for me now: I installed paperless-ngx, set it up to scan my email folders, copied all the random PDFs from my “organized” tax folder, and scanned the rest.

Too bad I just happen to have that Brother printer/scanner without SMB or FTP support. So I need to go through the process of scanning on my computer first, then uploading.
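That manual upload step could probably be scripted. A rough sketch, assuming the paperless consume folder is mounted somewhere on the computer doing the scanning; the function name and both folder paths are made up for illustration and are not part of paperless-ngx:

```python
import shutil
from pathlib import Path


def move_scans_to_consume(scan_dir: Path, consume_dir: Path) -> list[Path]:
    """Move every PDF in scan_dir into consume_dir and return the new paths.

    Both directories are hypothetical and must already exist.
    """
    moved = []
    for pdf in sorted(scan_dir.glob("*.pdf")):
        target = consume_dir / pdf.name
        if target.exists():
            # Skip instead of overwriting a file that is still waiting
            # to be picked up.
            continue
        shutil.move(str(pdf), str(target))
        moved.append(target)
    return moved
```

Something like `move_scans_to_consume(Path.home() / "Scans", Path("/mnt/paperless/consume"))` run after each scanning session would then queue the new files; both paths are placeholders.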

timbuck2themoon@sh.itjust.works · 11 months ago

If it can email, you can have it send scans to an email address, and paperless can automatically grab them and archive them.

pimeys@lemmy.nauk.io · 11 months ago

I’ve been digging into the settings of this printer and, sadly, the only sending it can do is as a fax… It’s the entry model, and it has been serving us very nicely for years. It even connects to the internet, but it lacks features such as email, SMB or FTP. To me this looks like something an open source firmware could fix. It has enough processing power to possibly run a lightweight Linux distribution, so installing one that would enable modern communication protocols doesn’t seem impossible.

Kata1yst@kbin.social · 11 months ago

I’ve had excellent luck with Docspell. https://github.com/eikek/docspell

nutshell7827@lemmy.world · 11 months ago

I love docspell.

ddh@lemmy.sdf.org · 11 months ago

Have a look at Paperless.

rentar42@kbin.social · 11 months ago

Note that just because everything is digital doesn’t mean something like that isn’t necessary: If you depend on your service provider to keep all of your records then you will be out of luck once they … stop liking you, go out of business, have a technical malfunction, decide they no longer want to keep any records older than X years, …

So even in an all-digital world I’d still keep all the PDF artifacts in something like that.

And I also second the suggestion of paperless-ngx (I haven’t been using it for very long yet, but it’s working great so far).

pimeys@lemmy.nauk.io · 11 months ago

Of course. My setup now is a Proxmox server + a NAS. What I’m planning to do is install a service for this on Proxmox, then have the files synced over NFS to the NAS, which then backs them up every night to Backblaze. And of course I need to keep the paper copies too, but being able to search, tag and archive the documents is great when you need to remember a thing X that was mentioned in a paper you got back in 2014.

Padook@feddit.nl · 11 months ago

I’ve been pretty happy with paperless-ngx, it should tick all your boxes

TCB13@lemmy.world · edit-2 3 months ago

Removed by mod

SciPiTie@iusearchlinux.fyi · 11 months ago

The three letters OCR, plus tagging, fuzzy search and ease of use, are what do it for me.

For example, I have never needed to look up a letter by its date, but quite often by its context.

Your suggestion just digitizes physical folders. If that’s enough for you, OK - but you’re missing out.

TCB13@lemmy.world · edit-2 3 months ago

Removed by mod

SciPiTie@iusearchlinux.fyi · 11 months ago

For me it was a few hours wrapping my head around how paperless ngx works and its setup. I had a folder structure as you described already on my Nextcloud so I just configured paperless to observe it for new files.

Where I spent more time than was reasonable was the tagging - you can automate it based on… well, everything.

Now I just let it suggest tags based on my existing documents, plus add a NEW tag to the ones I’ve never reviewed. That’s just a reminder for me to review the tags when searching, though; I don’t actively re-tag new uploads.

If you have a docker environment I suggest just pulling a container up, throwing all your documents in it and seeing whether it would save you time or cost you time. Would be an hour well spent!

Personally, the OCR alone is worth it for me - my country still loves paper letters, and being able to copy text out of them is awesome (IBAN, account numbers, etc. - all the stuff that’s susceptible to typos).
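To illustrate the typo point: IBANs carry a mod-97 check, so a value copied out of the OCR text (or typed by hand) can be sanity-checked before use. A minimal sketch in Python; the function name is made up, this is not something paperless-ngx does for you, and it only verifies the checksum, not country-specific lengths or formats:

```python
def iban_checksum_ok(iban: str) -> bool:
    """Return True if the string passes the IBAN mod-97 checksum."""
    compact = "".join(iban.split()).upper()
    # IBANs are between 15 and 34 ASCII letters/digits long.
    if not (15 <= len(compact) <= 34):
        return False
    if not (compact.isascii() and compact.isalnum()):
        return False
    # Move country code and check digits to the end, then map
    # letters to numbers (A=10 ... Z=35) and take the remainder.
    rearranged = compact[4:] + compact[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1
```

With the commonly cited sample IBAN `DE89 3704 0044 0532 0130 00` this returns `True`; changing its last digit to `1` makes it return `False`.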

TCB13@lemmy.world · edit-2 3 months ago

Removed by mod

SciPiTie@iusearchlinux.fyi · 11 months ago

Worst case, I have all my OCRed documents as raw files which I can migrate to wherever.

Files still exist. For my case encrypted as well. My backups roll on the data, not the container.

But I’m not trying to convince you, I tried answering the questions :)

And to answer your last question clearly: I survived before paperless, and I’d get along without it. I’d find a new tool to cut down my manual labor as well as possible - if that’s not possible, then no harm done. I know I’m flexible, I can learn new tools, and I’m never vendor or tool locked-in. I have a high level of self-confidence when it comes to my tool chain and how I’d adapt any part of it - from password manager to cloud storage and my mail flow.

To be honest, I couldn’t self-host anything if I had the fear of being lost whenever a tool is discontinued.

TCB13@lemmy.world · edit-2 3 months ago

Removed by mod

mellitiger@iusearchlinux.fyi · 11 months ago

My way of using paperless-ngx includes an automatic export to plain pdf-files which are synced via syncthing.

Everything is accessible with a normal filesystem and over the keepass-gui…

SciPiTie@iusearchlinux.fyi · 11 months ago

Ahh, I don’t use paperless as an exclusive document storage but as a pure manager. It searches and tags but doesn’t have exclusivity over any files but its own indices!

It doesn’t provide more value than jellyfin in that regard - make it visible and accessible.

Moonrise2473@feddit.it · 11 months ago

Good luck finding stuff when you have 1000+ files and you only search in the file name