ArchiveTeam Warrior
ArchiveTeam Warrior is a virtual archiving appliance. It will take some of your extra CPU and bandwidth and collect data from various sites (projects of the ArchiveTeam) and send it to the ArchiveTeam. This gets aggregated and added to the Internet Archive to help preserve our digital heritage.
Product: ArchiveTeam Warrior
Install Type: Manifest
Container Image: Docker
Installation Details
I have not yet created a Helm chart for this, so I have written manifest files to install ArchiveTeam Warrior on Kubernetes. These manifests were adapted from the ArchiveTeam Warrior Docker instructions. You should read through that page first to understand what is being adapted here.
The following assumes you have an existing namespace named utility, an NGINX ingress named nginx-int, and Cert Manager configured to use an ACME provider. Because this control panel has no need to be public, this setup uses my internal CA, run with Step CA. Please adjust for your particular needs.
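Before applying anything, it may be worth confirming that these prerequisites actually exist in the cluster. The following is only a minimal sketch, assuming kubectl is already pointed at the right cluster and that cert-manager is issuing through a ClusterIssuer. The function name check_prereqs is hypothetical, and the issuer name you pass in is whatever you called yours.

```shell
# Hypothetical helper: confirm the ingress class and ClusterIssuer exist.
# Arguments: $1 = IngressClass name, $2 = cert-manager ClusterIssuer name.
check_prereqs() {
  ingress_class="$1"
  cluster_issuer="$2"
  # Each lookup fails with a non-zero status if the object is missing.
  kubectl get ingressclass "$ingress_class" || return 1
  kubectl get clusterissuer "$cluster_issuer" || return 1
  echo "prerequisites found"
}
```

For the setup described here, this would be called as check_prereqs nginx-int your-issuer.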
00-utility-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: utility
  labels:
    name: utility
This is optional, but I create the namespace in my builds in case I have to recover from scratch. If it already exists, applying it again has no real negative effect.
01-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: archiveteam
  namespace: utility
  labels:
    app.kubernetes.io/name: archiveteam
spec:
  selector:
    matchLabels:
      app: archiveteam
  template:
    metadata:
      labels:
        app: archiveteam
        app.kubernetes.io/name: archiveteam
    spec:
      containers:
        - name: archiveteam
          image: atdr.meo.ws/archiveteam/warrior-dockerfile:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 8001
          envFrom:
            - configMapRef:
                name: archiveteam
02-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: archiveteam
  namespace: utility
data:
  DOWNLOADER: <<Your_ID>>
  SELECTED_PROJECT: auto
  SHARED_RSYNC_THREADS: '40'
  WARRIOR_ID: <<Your_ID>>
  CONCURRENT_ITEMS: '6'
You should use your own unique ID here. Also, please adjust SHARED_RSYNC_THREADS and CONCURRENT_ITEMS to fit your needs. The values shown represent the current maximum values.
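If you would rather not edit the placeholders by hand, the ConfigMap can be generated from a few variables. This is just a sketch: the function name render_configmap and its argument order are hypothetical, and the defaults simply repeat the values from the manifest above. The numbers are quoted because ConfigMap data values must be strings.

```shell
# Hypothetical helper: print the ConfigMap with your ID filled in.
# Arguments: $1 = your ID,
#            $2 = CONCURRENT_ITEMS (defaults to 6),
#            $3 = SHARED_RSYNC_THREADS (defaults to 40).
render_configmap() {
  warrior_id="$1"
  items="${2:-6}"
  rsync_threads="${3:-40}"
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: archiveteam
  namespace: utility
data:
  DOWNLOADER: "${warrior_id}"
  SELECTED_PROJECT: auto
  SHARED_RSYNC_THREADS: "${rsync_threads}"
  WARRIOR_ID: "${warrior_id}"
  CONCURRENT_ITEMS: "${items}"
EOF
}
```

For example, render_configmap my-nick 2 10 > 02-configmap.yaml would write a more conservative configuration than the maximums.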
03-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: archiveteam
  namespace: utility
spec:
  selector:
    app: archiveteam
  ports:
    - port: 8001
      targetPort: 8001
  type: ClusterIP
04-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: archiveteam
  namespace: utility
  labels:
    name: archiveteam
  annotations:
    cert-manager.io/cluster-issuer: your-issuer
spec:
  ingressClassName: your-ingress
  rules:
    - host: your.host.name
      http:
        paths:
          - pathType: Prefix
            path: "/"
            backend:
              service:
                name: archiveteam
                port:
                  number: 8001
  tls:
    - hosts:
        - your.host.name
      secretName: archiveteam-int-tls
As mentioned above, the ingress configuration assumes working ingress and cert-manager configurations.
Now, we can deploy this all together with:
kubectl apply -f 00-utility-namespace.yaml \
  -f 01-deployment.yaml \
  -f 02-configmap.yaml \
  -f 03-service.yaml \
  -f 04-ingress.yaml
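Once the manifests are applied, you will probably want to confirm that the pod actually came up. A rough sketch, assuming the names and namespace used in the manifests above; the function name verify_warrior and the 120 second timeout are my own choices, not anything the Warrior requires.

```shell
# Hypothetical helper: wait for the rollout, then list what was created.
verify_warrior() {
  # Blocks until the Deployment is rolled out or the timeout expires.
  kubectl -n utility rollout status deployment/archiveteam --timeout=120s || return 1
  # The Service selects pods on this same label.
  kubectl -n utility get pods -l app=archiveteam
  kubectl -n utility get service/archiveteam ingress/archiveteam
}
```

If the ingress is not resolving yet, kubectl -n utility port-forward service/archiveteam 8001:8001 should make the control panel reachable on localhost port 8001 in the meantime.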