작성일:

trino(presto), minio docker 활용 테스트 환경

trino (hive connector) to minio

docker-compose (trino, minio)

version: '3'
services:
  trino:
    image: trinodb/trino
    container_name: trino
    restart: always
    ports:
      - "8080:8080"
    volumes:
      - ./etc:/etc/trino

  minio:
    image: minio/minio
    restart: always
    command: server /data --console-address ":9001"
    container_name: minio
    environment:
      MINIO_ROOT_USER: minio
      MINIO_ROOT_PASSWORD: minio1234
    restart: always
    ports:
      - "9000:9000"
      - "9001:9001"
    volumes:
      - ./minio-data:/data

minio (hive) properties

  • etc/catalog/hive.properties
  • metastore 는 minio catalog 버킷에 저장 s3a://catalog/trino/
  • minio 세팅 정보 : access-key, secret-key, endpoint
connector.name=hive-hadoop2
hive.metastore=file
hive.metastore.catalog.dir=s3a://catalog/trino/

hive.recursive-directories=true
hive.non-managed-table-writes-enabled=true
hive.allow-drop-table=true

hive.s3.path-style-access=true
hive.s3.ssl.enabled=false
hive.s3select-pushdown.enabled=true

hive.s3.aws-access-key=minio
hive.s3.aws-secret-key=minio1234
hive.s3.endpoint=http://minio:9000

docker-compose 실행 및 초기값 세팅

## 실행 
$ docker-compose up -d 
reating network "trino_default" with the default driver
Creating minio ... done
Creating trino ... done

rclone 으로 minio 버킷 생성

# rclone 설정
$ cat local.conf
[local]
type = s3
provider = Minio
env_auth = false
access_key_id = minio
secret_access_key = minio1234
endpoint = http://localhost:9000

# create catalog, data directory 
$ rclone --config local.conf mkdir local:catalog
$ rclone --config local.conf mkdir local:data
# sample json (ndjson) data)
$ cat 1.json
{"id":1,"name":"Alice"}
{"id":2,"name":"Bob"}
{"id":3,"name":"Carol"}

# test data copy
$ rclone --config local.conf copy 1.json local:data/sample/

schema, table 생성

  • default schema (db) 생성 및 external_location table 생성
-- hive 의 schema(db) 생성 
create schema hive.default;

-- table 생성 
create table hive.default.sample (
  id varchar,
  name varchar
)
with (
  format = 'json',  
  external_location = 's3a://data/sample/'
);

-- select 
select * from sample;

쿼리 테스트 (dbeaver)

댓글남기기