moving object storage to seaweedfs
back then, fedinet.waltuh.cyou depended on eu2.contabostorage.com, the contabo object storage, for its media uploads. initially we found it quite useful, but then the media started disappearing by itself.

initially, this problem happened to a post from my friend, Irfan: i noticed that the media on the post had become a 404. i thought it was temporary, but then it slowly revealed itself to be affecting other users too. several weeks later, it also affected me.

i also noticed that it wasn't just one of these photos that was gone, it was 4-7 of them. these were only noticeable after i restarted varnish cache (which effectively starts the cache from 0).

recovery is a pain in the ass. public posts are easy to recover by obtaining a copy from the remote instance's CDN cache, but media from non-public posts is harder to recover than you'd think, unless you're also following the author on your alt account.
i contacted contabo support, but they weren't even aware of it. i told them that there are no deletion API calls being triggered via S3 in the backend that i host; it was just an upload, and then several hours later, the object was gone by itself. their reply:
Dear Lee,
We would like to clarify that there are no deletion actions being triggered from our side in the backend. What about any lifecycle policy? Didn't have any that expires object after sometime? Please monitor it from now on and inform us if it occurs again.
let's check the akkoma source code, specifically this part: https://akkoma.dev/AkkomaGang/akkoma/src/commit/f3b39e9ea25bda9bb2e7611f6025499a3cc51c1d/lib/pleroma/uploaders/s3.ex#L34-L38
if we check how the upload is done:
ExAws.S3.upload(bucket, s3_name, [
  {:acl, :public_read},
  {:content_type, upload.content_type}
])
it's basically just that.
but then there's the attachment cleanup worker, and that doesn't seem like it would break anything either. i also checked the other code relying on delete_file(), and found nothing suspicious.
so that leaves me in a dead end.
time to do our own solution, i guess
hi. seaweedfs

to be frank, the previous setup on fedi.lecturify.net years ago used minio for the s3 storage. something about it always bothered me, and i never really liked it either.
so i was looking around the object-storage tag on github explore and stumbled upon seaweedfs, which promises O(1) disk seek. there's also rustfs, but due to time constraints, i decided it wasn't worth compiling on the server. so i decided to compile seaweedfs instead, which only took 3 minutes to finish.
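for reference, seaweedfs is written in go, so the build boils down to roughly this (a sketch based on the seaweedfs README, not necessarily the exact commands i typed):

```shell
# fetch, compile and install the weed binary with the go toolchain
go install github.com/seaweedfs/seaweedfs/weed@latest

# the binary lands in $(go env GOPATH)/bin; copy it system-wide
cp "$(go env GOPATH)/bin/weed" /usr/local/bin/
```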
right after compiling it, i installed it to /usr/local/bin/, and then made an openrc unit in the container:
~ # cat /etc/init.d/weed
#!/sbin/openrc-run
name="seaweedfs"
description="SeaweedFS service"
supervisor="supervise-daemon"
command="/usr/local/bin/weed"
command_background="no"
command_user="weed"
command_args="server -s3.externalUrl=https://objstorage.waltuh.cyou -s3 -volume.disk ssd -dir /home/weed/data"
no_new_privs="yes"
pidfile="/run/${RC_SVCNAME}/${RC_SVCNAME}.pid"
retry="SIGTERM/60/SIGKILL/5"
error_log="/var/log/${RC_SVCNAME}.log"
output_log="/var/log/${RC_SVCNAME}.log"
depend() {
	need net
}

start_pre() {
	checkpath --directory --owner $command_user:$command_user --mode 0755 /run/${RC_SVCNAME}
	checkpath --file --owner $command_user:$command_user --mode 0644 $error_log
}
that's what i did.
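with the unit in place, enabling and starting it is the usual openrc routine (a sketch; it assumes the `weed` user from the unit above already exists):

```shell
# register the service at the default runlevel and start it
rc-update add weed default
rc-service weed start

# tail the combined log the unit writes to
tail -f /var/log/weed.log
```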
for the bucket and user management (including keys), i summon weed admin and then access the admin UI locally via ssh port forwarding. i manage everything from there.
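the port forwarding looks roughly like this (assuming weed admin's default listen port of 23646; the ssh user and hostname are placeholders):

```shell
# forward the admin UI to localhost over ssh
ssh -L 23646:127.0.0.1:23646 user@waltuh.cyou
# then open http://127.0.0.1:23646 in a local browser
```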
as usual, i began by syncing everything from the contabo storage via aws-cli before finally syncing it back to our own bucket, now in seaweedfs. things went surprisingly smoothly.
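the migration amounts to two aws-cli syncs, roughly like this (the local staging path is a made-up example; credentials for each side are assumed to be configured as profiles):

```shell
# pull everything down from the contabo bucket to a local staging dir
aws s3 sync s3://fedi /tmp/fedi-staging \
  --endpoint-url https://eu2.contabostorage.com

# push it back up to the seaweedfs s3 endpoint
aws s3 sync /tmp/fedi-staging s3://fedi \
  --endpoint-url https://objstorage.waltuh.cyou
```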
except on one particular part
every object was inaccessible to anonymous access by default
to be frank, seaweedfs is a weird one. the admin web UI has no bucket-specific policy settings. even the policies menu barely seems to do anything, even after i applied one to the bucket owner. so i deleted that.
when checking around, i found out that if you craft this public-policy.json:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::fedi/*"
    }
  ]
}
and then apply it via aws cli:
$ aws s3api put-bucket-policy --bucket fedi --policy file://public-policy.json
then it finally allows public object access. strange? i think so.
even after applying the policy via aws-cli, the admin UI barely changed anything. there's no mention of the bucket policy anywhere.
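you can still confirm from the client side that the policy took effect (the object key here is a made-up example):

```shell
# the policy should read back via the s3 api
aws s3api get-bucket-policy --bucket fedi \
  --endpoint-url https://objstorage.waltuh.cyou

# and an anonymous request should now return 200 instead of 403
curl -sI https://objstorage.waltuh.cyou/fedi/example.png
```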
i guess that's all i can tell you.
still curious? here's the current server mood:
+---------------+-------------+-----------+------+
| INSTANCE NAME | CPU TIME(S) | MEMORY | DISK |
+---------------+-------------+-----------+------+
| akkoma | 5013.68 | 395.22MiB | |
+---------------+-------------+-----------+------+
| mariadb | 27.82 | 138.09MiB | |
+---------------+-------------+-----------+------+
| mediaproxyoma | 409.46 | 86.34MiB | |
+---------------+-------------+-----------+------+
| pg | 1945.70 | 193.20MiB | |
+---------------+-------------+-----------+------+
| s3 | 41146.94 | 358.01MiB | |
+---------------+-------------+-----------+------+
| toys | 3739.83 | 86.48MiB | |
+---------------+-------------+-----------+------+
| varnish | 59.29 | 43.76MiB | |
+---------------+-------------+-----------+------+
| writefreely | 36.93 | 37.92MiB | |
+---------------+-------------+-----------+------+
disk:
yonle@waltuh:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       296G   44G  240G  16% /
ram & cpu & load (htop):

alright. that's it. bye.