Skip to main content

2 posts tagged with "Test"

View All Tags

· 5 min read
zhaosimon
Michelle Wu

This blog introduces an MinIO storage solution built on HwameiStor, and clarifies the detailed test procedures about whether HwameiStor can properly support those basic features and tenant isolation function provided by MinIO.

MinIO introduction

MinIO is a high performance object storage solution with native support for Kubernetes deployments. It can provide distributed, S3-compatible, and multi-cloud storage service in public cloud, private cloud, and edge computing scenarios. MinIO is a software-defined product and released under GNU Affero General Public License v3.0. It can also run well on x86 and other standard hardware.

MinIO design

MinIO is designed to meet private cloud's requirements for high performance, in addition to all required features of object storage. MinIO features easy to use, cost-effective, and high performance in providing scalable cloud-native object storage services.

MinIO works well in traditional object storage scenarios, such as secondary storage, disaster recovery, and archiving. It also shows competitive capabilities in machine learning, big data, private cloud, hybrid cloud, and other emerging fields to well support data analysis, high performance workloads, and cloud-native applications.

MinIO architecture

MinIO is designed for the cloud-native architecture, so it can be run as a lightweight container and managed by external orchestration tools like Kubernetes.

The MinIO package comprises of static binary files less than 100 MB. This small package enables it to efficiently use CPU and memory resources even with high workloads and can host a large number of tenants on shared hardware.

MinIO's architecture is as follows:

Architecture

MinIO can run on a standard server that have installed proper local drivers (JBOD/JBOF). An MinIO cluster has a totally symmetric architecture. In other words, each server provide same functions, without any name node or metadata server.

MinIO can write both data and metadata as objects, so there is no need to use metadata servers. MinIO provides erasure coding, bitrot protection, encryption and other features in a strict and consistent way.

Each MinIO cluster is a set of distributed MinIO servers, one MinIO process running on each node.

MinIO runs in a userspace as a single process, and it uses lightweight co-routines for high concurrence. It divides drivers into erasure sets (generally 16 drivers in each set), and uses the deterministic hash algorithm to place objects into these erasure sets.

MinIO is specifically designed for large-scale and multi-datacenter cloud storage service. Tenants can run their own MinIO clusters separately from others, getting rid of interruptions from upgrade or security problems. Tenants can scale up by connecting multi clusters across geographical regions.

node-distribution-setup

Build test environment

Deploy Kubernetes cluster

A Kubernetes cluster was deployed with three virtual machines: one as the master node and two as worker nodes. The kubelet version is 1.22.0.

k8s-cluster

Deploy HwameiStor local storage

Deploy HwameiStor local storage on Kubernetes:

check HwameiStor local storage

Allocate five disks (SDB, SDC, SDD, SDE, and SDF) for each worker node to support HwameiStor local disk management:

lsblk

lsblk

Check node status of local storage:

get-lsn

Create storageClass:

get-sc

Deploy distributed multi-tenant cluster (minio-operator)

This section will show how to deploy minio-operator, how to create a tenant, and how to configure HwameiStor local volumes.

Deploy minio-operator

  1. Copy minio-operator repo to your local environment

    git clone <https://github.com/minio/operator.git>

    helm-repo-list

    ls-operator

  2. Enter helm operator directory /root/operator/helm/operator

    ls-pwd

  3. Deploy the minio-operator instance

    helm install minio-operator \
    --namespace minio-operator \
    --create-namespace \
    --generate-name .
    --set persistence.storageClass=local-storage-hdd-lvm .
  4. Check minio-operator running status

    get-all

Create tenants

  1. Enter the /root/operator/examples/kustomization/base directory and change tenant.yaml

    git-diff-yaml

  2. Enter the /root/operator/helm/tenant/ directory and change values.yaml

    git-diff-values.yaml

  3. Enter /root/operator/examples/kustomization/tenant-lite directory and change kustomization.yaml

    git-diff-kustomization-yaml

  4. Change tenant.yaml

    git-diff-tenant-yaml02

  5. Change tenantNamePatch.yaml

    git-diff-tenant-name-patch-yaml

  6. Create a tenant

    kubectl apply –k . 
  7. Check resource status of the tenant minio-t1

    kubectl-get-all-nminio-tenant

  8. To create another new tenant, you can first create a new directory tenant (in this example tenant-lite-2) under /root/operator/examples/kustomization and change the files listed above

    pwd-ls-ls

  9. Run kubectl apply –k . to create the new tenant minio-t2

    kubectl-get-all-nminio

Configure HwameiStor local volumes

Run the following commands in sequence to finish this configuration:

kubectl get statefulset.apps/minio-t1-pool-0 -nminio-tenant -oyaml

local-storage-hdd-lvm

kubectl get pvc –A

kubectl-get-pvc

kubectl get pvc export-minio6-0 -nminio-6 -oyaml

kubectl-get-pvc-export-oyaml

kubectl get pv

kubectl-get-pv

kubectl get pvc data0-minio-t1-pool-0-0 -nminio-tenant -oyaml

kubectl-get-pvc-oyaml

kubectl get lv

kubectl-get-lv

kubect get lvr

kubectl-get-lvr

Test HwameiStor's support for MinIo

With the above settings in place, now let's test basic features and tenant isolation.

Test basic features

  1. Log in to minio console:10.6.163.52:30401/login

    minio-opeartor-console-login

  2. Get JWT by kubectl minio proxy -n minio-operator

    minio-opeartor-console-login

  3. Browse and manage information about newly-created tenants

    tenant01

    tenant02

    tenant03

    tenant04

    tenant05

    tenant06

  4. Log in as tenant minio-t1 (Account: minio)

    login-minio

    login-minio

  5. Browse bucket bk-1

    view-bucket-1

    view-bucket-1

    view-bucket-1

  6. Create a new bucket bk-1-1

    create-bucket-1-1

    create-bucket-1-1

    create-bucket-1-1

  7. Create path path-1-2

    create-path-1-2

    create-path-1-2

  8. Upload the file

    upload-file

    upload-file

    upload-file

  9. Upload the folder

    upload-folder

    upload-folder

    upload-folder

    upload-folder

  10. Create a user with read-only permission

    create-user

    create-user

Test tenant isolation

  1. Log in as tenant minio-t2

    login-t2

    login-t2

  2. Only minio-t2 information is visible. You cannot see information about tenant minio-t1.

    only-t2

  3. Create bucket

    create-bucket

    create-bucket

  4. Create path

    create-path

    create-path

  5. Upload the file

    upload-file

    upload-file

  6. Create a user

    create-user

    create-user

    create-user

    create-user

    create-user

  7. Configure user policies

    user-policy

    user-policy

  8. Delete a bucket

    delete-bucket

    delete-bucket

    delete-bucket

    delete-bucket

    delete-bucket

    delete-bucket

Conclusion

In this test, we successfully deployed MinIO distributed object storage on the basis of Kubernetes 1.22 and the HwameiStor local storage. We performed the basic feature test, system security test, and operation and maintenance management test.

All tests are passed, proving HwameiStor can well support for MinIO.

· 15 min read
zhaosimon
Michael Yao

This test environment is Kubernetes 1.22. We deployed TiDB 1.3 after the HwameiStor local storage is attached. Then, we performed the basic SQL capability test, system security test, and operation and maintenance management test.

All the tests passed successfully, it is acknowledged that HwameiStor can support distributed database application scenarios such as TiDB with high availability, strong consistency requirements, and large data scale.

Introduction to TiDB

TiDB is a distributed database product that supports OLTP (Online Transactional Processing), OLAP (Online Analytical Processing), and HTAP (Hybrid Transactional and Analytical Processing) services, compatible with key features such as MySQL 5.7 protocol and MySQL ecosystem. The goal of TiDB is to provide users with one-stop OLTP, OLAP, and HTAP solutions, which are suitable for various application scenarios such as high availability, strict requirements for strong consistency, and large data scale.

TiDB architecture

The TiDB distributed database splits the overall architecture into multiple modules that can communicate with each other. The architecture diagram is as follows:

TiDB architecture

  • TiDB Server

    The SQL layer exposes the connection endpoints of the MySQL protocol to the outside world, and is responsible for accepting connections from clients, performing SQL parsing and optimization and finally generating a distributed execution plan. The TiDB layer itself is stateless. In practice, you can start several TiDB instances. A unified access address is provided externally through load-balance components (such as LVS, HAProxy, or F5), and client connections can be evenly distributed on to these TiDB instances. The TiDB server itself does not store data, but only parses SQL and forwards the actual data read request to the underlying storage node, TiKV (or TiFlash).

  • PD (Placement Driver) Server

    The metadata management module across a TiDB cluster is responsible for storing the real-time data distribution of each TiKV node and the overall topology of the cluster, providing the TiDB Dashboard management and control interface, and assigning transaction IDs to distributed transactions. Placement Driver (PD) not only stores metadata, but also issues data scheduling commands to specific TiKV nodes based on the real-time data distribution status reported by TiKV nodes, which can be said to be the "brain" of the entire cluster. In addition, the PD itself is also composed of at least 3 nodes and has high availability capabilities. It is recommended to deploy an odd number of PD nodes.

  • Storage nodes

    • TiKV Server: In charge of storing data. From the outside, TiKV is a distributed Key-Value storage engine that provides transactions. The basic unit for storing data is Region. Each Region is responsible for storing the data of a Key Range (the block between left-closed and right-open from StartKey to EndKey). Each TiKV node is responsible for multiple Regions. TiKV API provides native support for distributed transactions at the KV key-value pair level, and provides the levels of Snapshot Isolation (SI) by default, which is also the core of TiDB's support for distributed transactions at the SQL level. After the SQL layer of TiDB completes the SQL parsing, it will convert the SQL execution plan into the actual call to the TiKV API. Therefore, the data is stored in TiKV. In addition, the TiKV data will be automatically maintained in multiple replicas (the default is three replicas), which naturally supports high availability and automatic failover.

    • TiFlash is a special storage node. Unlike ordinary TiKV nodes, data is stored in columns in TiFlash, and the main function is to accelerate analysis-based scenarios.

TiDB database storage

TiDB database storage

  • Key-Value Pair

    The choice of TiKV is the Key-Value model that provides an ordered traversal method. Two key points of TiKV data storage are:

    • A huge Map (comparable to std::map in C++) that stores Key-Value Pairs.

    • The Key-Value pairs in this Map are sorted by the binary order of the Key, that is, you can seek to the position of a certain Key, and then continuously call the Next method to obtain the Key-Value larger than this Key in an ascending order.

  • Local storage (Rocks DB)

    In any persistent storage engine, data must be saved on disk after all, and TiKV is not different. However, TiKV does not choose to write data directly to the disk, but stores the data in RocksDB, and RocksDB is responsible for the specific data storage. The reason is that developing a stand-alone storage engine requires a lot of work, especially to make a high-performance stand-alone engine, which may require various meticulous optimizations. RocksDB is a very good stand-alone KV storage engine open sourced by Facebook. It can meet various requirements of TiKV for single engine. Here we can simply consider that RocksDB is a persistent Key-Value Map on a host.

  • Raft protocol

    TiKV uses the Raft algorithm to ensure that data is not lost and error-free when a single host fails. In short, it is to replicate data to multiple hosts, so that if one host cannot provide services, replicas on other hosts can still provide services. This data replication scheme is reliable and efficient, and can deal with replica failures.

  • Region

    TiKV divides the Range by Key. A certain segment of consecutive Keys are stored on a storage node. Divide the entire Key-Value space into many segments, each segment is a series of consecutive Keys, called a Region. Try to keep the data saved in each Region within a reasonable size. Currently, the default in TiKV is no more than 96 MB. Each Region can be described by a left-closed and right-open block such as [StartKey, EndKey].

  • MVCC

    TiKV implements Multi-Version Concurrency Control (MVCC).

  • Distributed ACID transactions

    TiKV uses the transaction model used by Google in BigTable: Percolator.

Build the test environment

Kubernetes cluster

In this test, we use three VM nodes to deploy the Kubernetes cluster, including one master node and two worker nodes. Kubelete version is 1.22.0.

k8s cluster

HwameiStor local storage

  1. Deploy the HwameiStor local storage in the Kubernetes cluster

    HwameiStor local storage

  2. Configure a 100G local disk, sdb, for HwameiStor on two worker nodes respectively

    sdb1

    sdb2

  3. Create StorageClass

    create StorageClass

Deploy TiDB on Kubernetes

TiDB can be deployed on Kubernetes using TiDB Operator. TiDB Operator is an automatic operation and maintenance system for TiDB clusters on Kubernetes. It provides full lifecycle management of TiDB including deployment, upgrade, scaling, backup and recovery, and configuration changes. With TiDB Operator, TiDB can run seamlessly on public cloud or privately deployed Kubernetes clusters.

The compatibility between TiDB and TiDB Operator versions is as follows:

TiDB versionApplicable versions of TiDB Operator
devdev
TiDB >= 5.41.3
5.1 <= TiDB < 5.41.3 (recommended), 1.2
3.0 <= TiDB < 5.11.3 (recommended), 1.2, 1.1
2.1 <= TiDB < 3.01.0 (maintenance stopped)

Deploy TiDB Operator

  1. Install TiDB CRDs

    kubectl apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/master/manifests/crd.yaml
  2. Install TiDB Operator

    helm repo add pingcap https://charts.pingcap.org/ 
    kubectl create namespace tidb-admin
    helm install --namespace tidb-admin tidb-operator pingcap/tidb-operator --version v1.3.2 \
    --set operatorImage=registry.cn-beijing.aliyuncs.com/tidb/tidb-operator:v1.3.2 \
    --set tidbBackupManagerImage=registry.cn-beijing.aliyuncs.com/tidb/tidb-backup-manager:v1.3.2 \
    --set scheduler.kubeSchedulerImageName=registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler
  3. Check TiDB Operator components

    check TiDB Operator components

Deploy the TiDB cluster

kubectl create namespace tidb-cluster && \
kubectl -n tidb-cluster apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/master/examples/basic/tidb-cluster.yaml
kubectl -n tidb-cluster apply -f https://raw.githubusercontent.com /pingcap/tidb-operator/master/examples/basic/tidb-monitor.yaml

deploy TiDB cluster

Connect the TiDB cluster

yum -y install mysql-client

connect tidb

kubectl port-forward -n tidb-cluster svc/basic-tidb 4000 > pf4000.out & 

connect TiDB cluster

connect tidb cluster

connect TiDB cluster

Check and verify the TiDB cluster status

  1. Create the Hello_world table

    create table hello_world (id int unsigned not null auto_increment primary key, v varchar(32)); 

    create Hello_world table

  2. Check the TiDB version

    select tidb_version()\G;

    check version

  3. Check the Tikv storage status

    select * from information_schema.tikv_store_status\G;

    check storage

Configure the HwameiStor storage

Create a PVC for tidb-tikv and tidb-pd from storageClass local-storage-hdd-lvm:

HwameiStor storage config

pvc1

pvc2

kubectl get po basic-tikv-0 -oyaml

mountpvc

kubectl get po basic-pd-0 -oyaml

mountpvc1

Test procedure

Basic SQL capability test

After the database cluster is deployed, we performed the following tests about basic capabilities. All are successfully passed.

Distributed transaction

Test purpose: In the case of multiple isolation levels, check if the completeness constraints of distributed data operations are supported, such as atomicity, consistency, isolation, and durability (ACID)

Test steps:

  1. Create the database: testdb

  2. Create the table t_test ( id int AUTO_INCREMENT, name varchar(32), PRIMARY KEY (id) )

  3. Run a test script

Test result: The completeness constraints of distributed data operations are supported, such as atomicity, consistency, isolation, and durability (ACID), in the case of multiple isolation levels

Object isolation

Test purpose: Check if the object isolation can be implemented by using different schemas

Test script:

create database if not exists testdb;
use testdb
create table if not exists t_test
( id bigint,
name varchar(200),
sale_time datetime default current_timestamp,
constraint pk_t_test primary key (id)
);
insert into t_test(id,name) values (1,'a'),(2,'b'),(3,'c');
create user 'readonly'@'%' identified by "readonly";
grant select on testdb.* to readonly@'%';
select * from testdb.t_test;
update testdb.t_test set name='aaa';
create user 'otheruser'@'%' identified by "otheruser";

Test result: Supported to create different schemas to implement the object isolation

Table operation support

Test purpose: Check if you can create, delete, and modifiy table data, DML, columns, partition table

Test steps: Run the test scripts step by step after connecting the database

Test script:

# Create and delete table
drop table if exists t_test;
create table if not exists t_test
( id bigint default '0',
name varchar(200) default '' ,
sale_time datetime default current_timestamp,
constraint pk_t_test primary key (id)
);
# Delete and modify
insert into t_test(id,name) values (1,'a'),(2,'b'),(3,'c'),(4,'d'),(5,'e');
update t_test set name='aaa' where id=1;
update t_test set name='bbb' where id=2;
delete from t_dml where id=5;
# Modify, add, delete columns
alter table t_test modify column name varchar(250);
alter table t_test add column col varchar(255);
insert into t_test(id,name,col) values(10,'test','new_col');
alter table t_test add column colwithdefault varchar(255) default 'aaaa';
insert into t_test(id,name) values(20,'testdefault');
insert into t_test(id,name,colwithdefault ) values(10,'test','non-default ');
alter table t_test drop column colwithdefault;
# Type of partition table (only listed part of scripts)
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT NOT NULL,
store_id INT NOT NULL
)

Test result: Supported to create, delete, and modifiy table data, DML, columns, partition table

Index support

Test purpose: Verify different indexes (unique, clustered, partitioned, Bidirectional indexes, Expression-based indexes, hash indexes, etc.) and index rebuild operations.

Test script:

alter table t_test add unique index udx_t_test (name);
# The default is clustered index of primary key
ADMIN CHECK TABLE t_test;
create index time_idx on t_test(sale_time);
alter table t_test drop index time_idx;
admin show ddl jobs;
admin show ddl job queries 156;
create index time_idx on t_test(sale_time);

Test result: Supported to create, delete, combine, and list indexes and supported for unique index

Statements

Test purpose: Check if the statements in distributed databases are supported such as if, case when, for loop, while loop, loop exit when (up to 5 kinds)

Test script:

SELECT CASE id WHEN 1 THEN 'first' WHEN 2 THEN 'second' ELSE 'OTHERS' END AS id_new  FROM t_test;
SELECT IF(id>2,'int2+','int2-') from t_test;

Test result: supported for statements such as if, case when, for loop, while loop, and loop exit when (up to 5 kinds)

Parsing execution plan

Test purpose: Check if execution plan parsing is supported for distributed databases

Test script:

explain analyze select * from t_test where id NOT IN (1,2,4);
explain analyze select * from t_test a where EXISTS (select * from t_test b where a.id=b.id and b.id<3);
explain analyze SELECT IF(id>2,'int2+','int2-') from t_test;

Test result: the execution plan is supported to parse

Binding execution plan

Test purpose: Verify the feature of binding execution plan for distributed databases

Test steps:

  1. View the current execution plan of sql statements

  2. Use the binding feature

  3. View the execution plan after the sql statement is binded

  4. Delete the binding

Test script:

explain select * from employees3 a join employees4 b on a.id = b.id where a.lname='Johnson';
explain select /*+ hash_join(a,b) */ * from employees3 a join employees4 b on a.id = b.id where a.lname='Johnson';

Test result: It may not be hash_join when hint is not used, and it must be hash_join after hint is used.

Common functions

Test purpose: Verify standard functions of distributed databases

Test result: Standard database functions are supported

Explicit/implicit transactions

Test purpose: Verify the transaction support of distributed databases

Test result: Explict and implicit transactions are supported

Character set

Test purpose: Verify the data types supported by distributed database

Test result: Only the UTF-8 mb4 character set is supported now

Lock support

Test purpose: Verify the lock implementation of distributed databases

Test result: Described how the lock is implemented, what are blockage conditions in the case of R-R/R-W/W-W, and how the deadlock is handled

Isolation levels

Test purpose: Verify the transactional isolation levels of distributed databases

Test result: Supported for si and rc isolation levels (4.0 GA version)

Distributed complex query

Test purpose: Verify the complex query capabilities of distributed databases

Test result: Supported for the distributed complex queries and operations such as inter-node joins, and supported for window functions and hierarchical queries

System security test

This section describes system security tests. After the database cluster is deployed, all the following tests are passed.

Account management and permission test

Test purpose: Verify the accout permisson management of distributed databases

Test script:

select host,user,authentication_string from mysql.user;
create user tidb IDENTIFIED by 'tidb';
select host,user,authentication_string from mysql.user;
set password for tidb =password('tidbnew');
select host,user,authentication_string,Select_priv from mysql.user;
grant select on *.* to tidb;
flush privileges ;
select host,user,authentication_string,Select_priv from mysql.user;
grant all privileges on *.* to tidb;
flush privileges ;
select * from mysql.user where user='tidb';
revoke select on *.* from tidb;
flush privileges ;
revoke all privileges on *.* from tidb;
flush privileges ;
grant select(id) on test.TEST_HOTSPOT to tidb;
drop user tidb;

Test results:

  • Supported for creating, modifying, and deleting accounts, and configuring passwords, and supported for the separation of security, audit, and data management

  • Based on different accounts, various permission control for database includes: instance, library, table, and column

Access control

Test purpose: Verify the permission access control of distributed databases, and control the database data by granting basic CRUD (create, read, update, and delete) permissions

Test script:

mysql -u root -h 172.17.49.222 -P 4000
drop user tidb;
drop user tidb1;
create user tidb IDENTIFIED by 'tidb';
grant select on tidb.* to tidb;
grant insert on tidb.* to tidb;
grant update on tidb.* to tidb;
grant delete on tidb.* to tidb;
flush privileges;
show grants for tidb;
exit;
mysql -u tidb -h 172.17.49.222 -ptidb -P 4000 -D tidb -e 'select * from aa;'
mysql -u tidb -h 172.17.49.222 -ptidb -P 4000 -D tidb -e 'insert into aa values(2);'
mysql -u tidb -h 172.17.49.222 -ptidb -P 4000 -D tidb -e 'update aa set id=3;'
mysql -u tidb -h 172.17.49.222 -ptidb -P 4000 -D tidb -e 'delete from aa where id=3;'

Test result: Database data is controlled by granting the basic CRUD permissions

Whitelist

Test purpose: Verify the whitelist feature of distributed databases

Test script:

mysql -u root -h 172.17.49.102 -P 4000
drop user tidb;
create user tidb@'127.0.0.1' IDENTIFIED by 'tidb';
flush privileges;
select * from mysql.user where user='tidb';
mysql -u tidb -h 127.0.0.1 -P 4000 -ptidb
mysql -u tidb -h 172.17.49.102 -P 4000 -ptidb

Test result: Supported for the IP whitelist feature and supportred for matching actions with IP segments

Operation log

Test purpose: Verify the monitor capability to distributed databases

Test script: kubectl -ntidb-cluster logs tidb-test-pd-2 --tail 22

Test result: Record key actions or misoperations performed by users through the operation and maintenance management console or API

Operation and maintenance test

This section describes the operation and maintenance test. After the database cluster is deployed, the following operation and maintenance tests are all passed.

Import and export data

Test purpose: Verify the tools support for importing and exporting data of distributed databases

Test script:

select * from sbtest1 into outfile '/sbtest1.csv';
load data local infile '/sbtest1.csv' into table test100;

Test result: Supported for importing and exporting table, schema, and database

Slow log query

Test purpose: Get the SQL info by slow query

Prerequisite: The SQL execution time shall be longer than the configured threshold for slow query, and the SQL execution is completed

Test steps:

  1. Adjust the slow query threshold to 100 ms

  2. Run SQL

  3. View the slow query info from log, system table, or dashboard

Test script:

show variables like 'tidb_slow_log_threshold';
set tidb_slow_log_threshold=100;
select query_time, query from information_schema.slow_query where is_internal = false order by query_time desc limit 3;

Test result: Can get the slow query info.

For details about test data, see TiDB on HwameiStor Deployment and Test Logs.