Deploying doccano, an open-source text annotation tool, on Kubernetes
doccano
- GitHub: https://github.com/doccano/doccano
- Version: v1.7.0
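Building the images:
- Download the v1.7.0 source code
git clone -b v1.7.0 https://github.com/doccano/doccano.git
- Approach
Reading docker/docker-compose.prod.yaml shows that the doccano project consists of five services: doccano_backend, doccano_frontend, doccano_celery, postgres, and rabbitmq. Only the first three need custom-built images; postgres and rabbitmq run from stock images.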
- Build the doccano_backend image
# Run from the doccano directory
docker build -t doccano_backend:prod -f docker/Dockerfile.prod .
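- Build the doccano_celery image
docker-compose.prod.yaml shows that doccano_celery is built from the same Dockerfile.prod and differs only in its entrypoint, which is entrypoint: ["/opt/bin/prod-celery.sh"]. So it is enough to replace the last line of Dockerfile.prod. I made a copy and edited that instead, saved as docker/Dockerfile.prod-celery, so the original file stays untouched.
# Run from the doccano directory
docker build -t doccano_celery:prod -f docker/Dockerfile.prod-celery .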
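The copy-and-replace step described above has to happen before the doccano_celery build, and could be scripted roughly as follows. This is a minimal sketch that assumes, as the text states, that the last line of docker/Dockerfile.prod is the one that launches the server. The helper name make_celery_dockerfile and the choice of an ENTRYPOINT instruction for the replacement line are assumptions of mine, so check them against the actual file.

```shell
# Hypothetical helper: copy a Dockerfile and swap its last line for the
# celery entrypoint. Assumes the last line of the source file is the
# instruction that starts the server.
make_celery_dockerfile() {
  src=$1
  dst=$2
  # Copy everything except the final line.
  sed '$d' "$src" > "$dst"
  # Append the celery entrypoint named in docker-compose.prod.yaml.
  printf '%s\n' 'ENTRYPOINT ["/opt/bin/prod-celery.sh"]' >> "$dst"
}
```

From the doccano directory it would be called as `make_celery_dockerfile docker/Dockerfile.prod docker/Dockerfile.prod-celery`.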
- Build the doccano_frontend image
# Run from the doccano directory
docker build -t doccano_frontend:prod -f docker/Dockerfile.nginx .
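The three builds differ only in image name and Dockerfile, so they can be listed in one place. The sketch below only prints the commands; the function name is hypothetical, and Dockerfile.prod-celery is the edited copy described in the celery step.

```shell
# Print the build command for each custom image.
# Each input line is "<image name> <Dockerfile under docker/>".
print_build_commands() {
  while read -r image dockerfile; do
    printf 'docker build -t %s:prod -f docker/%s .\n' "$image" "$dockerfile"
  done <<'EOF'
doccano_backend Dockerfile.prod
doccano_celery Dockerfile.prod-celery
doccano_frontend Dockerfile.nginx
EOF
}

print_build_commands
```

To actually run the builds, pipe the output to a shell from the doccano directory: `print_build_commands | sh`.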
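- Pitfalls
Building is very slow from inside mainland China; building on a Google Cloud server is one workaround.
- If you build on a Google Cloud server, you also need to export the images and import them on your own server
# Export
docker save doccano_backend:prod -o doccano_backend-prod.tar
# Import
docker load -i doccano_backend-prod.tar
# Repeat for the other images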
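Repeating the export and import for all three images can be written as a loop. The sketch below prints the commands instead of running them; tar_name is a hypothetical helper that follows the name-tag.tar naming used above.

```shell
# Derive the archive name used in the text: "name:tag" -> "name-tag.tar".
tar_name() {
  printf '%s.tar\n' "$(printf '%s' "$1" | tr ':' '-')"
}

# Print the export and import commands for the three custom images.
for image in doccano_backend:prod doccano_celery:prod doccano_frontend:prod; do
  archive=$(tar_name "$image")
  echo "docker save $image -o $archive"   # run on the build machine
  echo "docker load -i $archive"          # run on your own server
done
```

Copying the .tar files between the two machines is left out, since the right tool depends on your setup.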
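k8s YAML files
In this article I split the postgres database out and deploy it separately. The remaining four services run in a single Pod and reach the postgres database through service discovery.
postgres database YAML file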
apiVersion: v1
kind: Namespace
metadata:
  labels:
    app: postgres
  name: postgres
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: postgres-pv-volume
  namespace: postgres
  labels:
    type: local
    app: postgres
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: "/opt/data/postgres"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: postgres-pv-claim
  namespace: postgres
  labels:
    app: postgres
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
- postgres Deployment and Service YAML file
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-deployment
  namespace: postgres
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: postgres
  replicas: 1
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:11.7
          imagePullPolicy: "IfNotPresent"
          ports:
            - containerPort: 5432
          envFrom:
            - configMapRef:
                name: postgres-config
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: postgredb
      volumes:
        - name: postgredb
          persistentVolumeClaim:
            claimName: postgres-pv-claim
---
apiVersion: v1
kind: Service
metadata:
  name: postgres-service
  namespace: postgres
  labels:
    app: postgres
spec:
  type: NodePort
  ports:
    - port: 5432
      targetPort: 5432
      protocol: TCP
  selector:
    app: postgres
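The Deployment above reads its environment from a ConfigMap named postgres-config, which this article does not show. Below is a minimal sketch of what it might contain, assuming the database name, user, and password are the same as the POSTGRES_* values in doccano-config (they need to match for doccano to connect). POSTGRES_DB, POSTGRES_USER, and POSTGRES_PASSWORD are the variables the official postgres image reads on first start; the function name is hypothetical.

```shell
# Print a ConfigMap manifest for the postgres Deployment.
# Assumption: the values mirror the POSTGRES_* entries in doccano-config.
postgres_config_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-config
  namespace: postgres
  labels:
    app: postgres
data:
  POSTGRES_DB: doccano
  POSTGRES_USER: postgresadmin
  POSTGRES_PASSWORD: admin12345
EOF
}
```

It could be applied with `postgres_config_manifest | kubectl apply -f -`, after the postgres namespace exists and before the Deployment is created.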
doccano deployment YAML files
apiVersion: v1
kind: Namespace
metadata:
  labels:
    app: doccano
  name: doccano
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: doccano-config
  namespace: doccano
  labels:
    app: doccano
data:
  ADMIN_USERNAME: admin
  ADMIN_PASSWORD: password
  ADMIN_EMAIL: admin@example.com
  RABBITMQ_DEFAULT_USER: doccano
  RABBITMQ_DEFAULT_PASS: doccano
  RABBITMQ_HOST: 127.0.0.1
  POSTGRES_DB: doccano
  POSTGRES_USER: postgresadmin
  POSTGRES_PASSWORD: admin12345
  # Reach the postgres database through service discovery. postgres-service.postgres.svc.cluster.local
  # is the DNS name of postgres-service (<service>.<namespace>.svc.cluster.local; the Service lives in
  # the postgres namespace), and 5432 is the port it exposes inside the k8s cluster.
  # If the postgres Service is deployed differently, change these values too. See the k8s Service docs for details.
  POSTGRES_HOST: postgres-service.postgres.svc.cluster.local
  POSTGRES_PORT: "5432"
  DOCCANO_BACKEND_HOST: 127.0.0.1
  DOCCANO_BACKEND_PORT: "8000"
  ALLOW_SIGNUP: "False"
  DEBUG: "False"
  PYTHONUNBUFFERED: "1"
  GOOGLE_TRACKING_ID: ""
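The database host follows the usual in-cluster naming scheme, service.namespace.svc.cluster-domain. Since postgres-service was created in the postgres namespace, the sketch below composes its name and then the DATABASE_URL that the backend and celery containers assemble from these keys. The helper service_fqdn is hypothetical, and cluster.local is assumed as the cluster domain (it is the default, but a cluster can be configured differently).

```shell
# Build the in-cluster DNS name of a Service:
# <service>.<namespace>.svc.<cluster domain> (default domain assumed).
service_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

# Values copied from doccano-config.
POSTGRES_HOST=$(service_fqdn postgres-service postgres)
POSTGRES_PORT=5432
POSTGRES_DB=doccano
POSTGRES_USER=postgresadmin
POSTGRES_PASSWORD=admin12345

# Same shape as the DATABASE_URL value in the Pod spec.
DATABASE_URL="postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:${POSTGRES_PORT}/${POSTGRES_DB}?sslmode=disable"
echo "$DATABASE_URL"
```

If the backend cannot reach the database, comparing this string with the DATABASE_URL inside the running container is a quick first check.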
- doccano Pod YAML file
# doccano-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: doccano
  namespace: doccano
  labels:
    app: doccano
spec:
  containers:
    - name: doccano-backend
      image: harbor.com:5080/prod/doccano_backend:prod
      env:
        - name: ADMIN_USERNAME
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: ADMIN_USERNAME
        - name: ADMIN_PASSWORD
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: ADMIN_PASSWORD
        - name: ADMIN_EMAIL
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: ADMIN_EMAIL
        - name: RABBITMQ_DEFAULT_USER
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: RABBITMQ_DEFAULT_USER
        - name: RABBITMQ_DEFAULT_PASS
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: RABBITMQ_DEFAULT_PASS
        - name: RABBITMQ_HOST
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: RABBITMQ_HOST
        - name: CELERY_BROKER_URL
          value: "amqp://$(RABBITMQ_DEFAULT_USER):$(RABBITMQ_DEFAULT_PASS)@$(RABBITMQ_HOST)"
        - name: POSTGRES_USER
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: POSTGRES_USER
        - name: POSTGRES_PASSWORD
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: POSTGRES_PASSWORD
        - name: POSTGRES_HOST
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: POSTGRES_HOST
        - name: POSTGRES_PORT
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: POSTGRES_PORT
        - name: POSTGRES_DB
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: POSTGRES_DB
        - name: DATABASE_URL
          value: "postgres://$(POSTGRES_USER):$(POSTGRES_PASSWORD)@$(POSTGRES_HOST):$(POSTGRES_PORT)/$(POSTGRES_DB)?sslmode=disable"
        - name: ALLOW_SIGNUP
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: ALLOW_SIGNUP
        - name: DEBUG
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: DEBUG
        - name: DJANGO_SETTINGS_MODULE
          value: config.settings.production
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 8000
      volumeMounts:
        - mountPath: /backend/staticfiles
          name: static-volume
        - mountPath: /backend/media
          name: media
        - mountPath: /backend/filepond-temp-uploads
          name: tmp-file
    - name: doccano-celery
      image: harbor.com:5080/prod/doccano_celery:prod
      env:
        - name: PYTHONUNBUFFERED
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: PYTHONUNBUFFERED
        - name: RABBITMQ_DEFAULT_USER
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: RABBITMQ_DEFAULT_USER
        - name: RABBITMQ_DEFAULT_PASS
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: RABBITMQ_DEFAULT_PASS
        - name: RABBITMQ_HOST
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: RABBITMQ_HOST
        - name: CELERY_BROKER_URL
          value: "amqp://$(RABBITMQ_DEFAULT_USER):$(RABBITMQ_DEFAULT_PASS)@$(RABBITMQ_HOST)"
        - name: POSTGRES_USER
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: POSTGRES_USER
        - name: POSTGRES_PASSWORD
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: POSTGRES_PASSWORD
        - name: POSTGRES_HOST
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: POSTGRES_HOST
        - name: POSTGRES_PORT
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: POSTGRES_PORT
        - name: POSTGRES_DB
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: POSTGRES_DB
        - name: DATABASE_URL
          value: "postgres://$(POSTGRES_USER):$(POSTGRES_PASSWORD)@$(POSTGRES_HOST):$(POSTGRES_PORT)/$(POSTGRES_DB)?sslmode=disable"
        - name: DJANGO_SETTINGS_MODULE
          value: config.settings.production
      imagePullPolicy: IfNotPresent
      volumeMounts:
        - mountPath: /backend/media
          name: media
        - mountPath: /backend/filepond-temp-uploads
          name: tmp-file
    - name: doccano-frontend
      image: harbor.com:5080/prod/doccano_frontend:prod
      env:
        - name: DOCCANO_BACKEND_HOST
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: DOCCANO_BACKEND_HOST
        - name: DOCCANO_BACKEND_PORT
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: DOCCANO_BACKEND_PORT
        - name: API_URL
          value: "http://$(DOCCANO_BACKEND_HOST):$(DOCCANO_BACKEND_PORT)"
        - name: GOOGLE_TRACKING_ID
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: GOOGLE_TRACKING_ID
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 8080
      volumeMounts:
        - mountPath: /static
          name: static-volume
        - mountPath: /media
          name: media
    - name: doccano-rabbitmq
      image: rabbitmq:3.8
      env:
        - name: RABBITMQ_DEFAULT_USER
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: RABBITMQ_DEFAULT_USER
        - name: RABBITMQ_DEFAULT_PASS
          valueFrom:
            configMapKeyRef:
              name: doccano-config
              key: RABBITMQ_DEFAULT_PASS
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 5672
  restartPolicy: Always
  hostAliases:
    - ip: "127.0.0.1"
      hostnames:
        - "backend"
  volumes:
    - name: static-volume
      emptyDir: {}
    - name: media
      emptyDir: {}
    - name: tmp-file
      emptyDir: {}
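The Pod pulls its three custom images from harbor.com:5080/prod, but the article does not show how they got there. Assuming that is a private registry you are already logged in to, the locally built images would need to be tagged and pushed along these lines. The sketch only prints the docker tag and docker push commands; registry_ref is a hypothetical helper.

```shell
# Registry prefix taken from the image fields in the Pod spec.
REGISTRY=harbor.com:5080/prod

# Map a local image name to the reference used in the Pod spec.
registry_ref() {
  printf '%s/%s\n' "$REGISTRY" "$1"
}

# Print the tag and push commands for the three custom images.
for image in doccano_backend:prod doccano_celery:prod doccano_frontend:prod; do
  echo "docker tag $image $(registry_ref "$image")"
  echo "docker push $(registry_ref "$image")"
done
```

With imagePullPolicy: IfNotPresent, a node that already holds an image under the full registry name will use it without pulling.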
- doccano Service YAML file
# doccano-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: doccano
  namespace: doccano
  labels:
    app: doccano
spec:
  type: NodePort
  ports:
    - port: 8080
      targetPort: 8080
      protocol: TCP
  selector:
    app: doccano
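Deployment
- Apply the YAML files
# Deploy (repeat for each YAML file)
kubectl apply -f xxx.yaml
- Check that the services are running
kubectl get pods,svc -n doccano
- Access doccano
In the output, service/doccano shows ports 8080:30053. Port 30053 is the NodePort through which doccano is reachable from outside, so the IP of any node in the k8s cluster plus port 30053 works. The Service manifest does not pin a nodePort, so Kubernetes picks one and yours will likely differ.
- The images I built and my YAML files, for reference only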
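Because the NodePort differs per cluster, it helps to read it from the output rather than copy 30053. The PORT(S) column of kubectl get svc has the form port:nodePort/protocol; the hypothetical helper below pulls out the middle part.

```shell
# Extract the node port from a PORT(S) value such as "8080:30053/TCP".
node_port() {
  ports=$1
  ports=${ports#*:}              # drop the service port and the colon
  printf '%s\n' "${ports%%/*}"   # drop the slash and the protocol
}

node_port "8080:30053/TCP"   # prints 30053
```

Alternatively, kubectl can return the value directly with `kubectl get svc doccano -n doccano -o jsonpath='{.spec.ports[0].nodePort}'`.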