
· 5 min read
zhaosimon
Michelle Wu

This blog introduces a MinIO storage solution built on HwameiStor and walks through the test procedures that verify whether HwameiStor can properly support MinIO's basic features and tenant isolation.

MinIO introduction

MinIO is a high-performance object storage solution with native support for Kubernetes deployments. It provides distributed, S3-compatible, and multi-cloud storage services in public cloud, private cloud, and edge computing scenarios. MinIO is a software-defined product released under the GNU Affero General Public License v3.0. It also runs well on x86 and other standard hardware.

MinIO design

MinIO is designed to meet the private cloud's requirements for high performance, in addition to all the required features of object storage. MinIO is easy to use, cost-effective, and high-performance in providing scalable cloud-native object storage services.

MinIO works well in traditional object storage scenarios, such as secondary storage, disaster recovery, and archiving. It also shows competitive capabilities in machine learning, big data, private cloud, hybrid cloud, and other emerging fields, supporting data analysis, high-performance workloads, and cloud-native applications.

MinIO architecture

MinIO is designed for the cloud-native architecture, so it can be run as a lightweight container and managed by external orchestration tools like Kubernetes.

The MinIO package comprises static binary files of less than 100 MB. This small footprint enables it to use CPU and memory resources efficiently even under high workloads and to host a large number of tenants on shared hardware.

MinIO's architecture is as follows:

Architecture

MinIO can run on a standard server with local drives installed (JBOD/JBOF). A MinIO cluster has a fully symmetric architecture. In other words, each server provides the same functions, without any name node or metadata server.

MinIO can write both data and metadata as objects, so there is no need to use metadata servers. MinIO provides erasure coding, bitrot protection, encryption and other features in a strict and consistent way.

Each MinIO cluster is a set of distributed MinIO servers, one MinIO process running on each node.

MinIO runs in userspace as a single process and uses lightweight coroutines for high concurrency. It divides drives into erasure sets (generally 16 drives per set) and uses a deterministic hash algorithm to place objects into these erasure sets.

MinIO is specifically designed for large-scale, multi-datacenter cloud storage services. Tenants can run their own MinIO clusters in isolation from others, avoiding disruptions caused by upgrades or security incidents. Tenants can scale up by connecting multiple clusters across geographical regions.

node-distribution-setup

Build test environment

Deploy Kubernetes cluster

A Kubernetes cluster was deployed with three virtual machines: one as the master node and two as worker nodes. The kubelet version is 1.22.0.

k8s-cluster

Deploy HwameiStor local storage

Deploy HwameiStor local storage on Kubernetes:

check HwameiStor local storage
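For reference, the HwameiStor deployment itself can be done with Helm. The following is a minimal sketch; the chart repository URL and chart name are assumptions based on the HwameiStor documentation at the time and should be verified against your version:

# Add the HwameiStor chart repo (repo URL and chart name are assumptions; check the official docs)
helm repo add hwameistor https://hwameistor.io/hwameistor
helm repo update
# Install HwameiStor into its own namespace
helm install hwameistor hwameistor/hwameistor --namespace hwameistor --create-namespace
# Verify that the HwameiStor components are running
kubectl -n hwameistor get pods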

Allocate five disks (sdb, sdc, sdd, sde, and sdf) to each worker node to support HwameiStor local disk management:

lsblk

lsblk

Check node status of local storage:

get-lsn

Create a StorageClass:

get-sc
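The local-storage-hdd-lvm StorageClass used throughout this test can be created with a manifest like the one below. This is a hedged sketch: the provisioner name and parameter keys follow what the HwameiStor release generated at the time and should be double-checked against your installed version.

cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage-hdd-lvm
provisioner: lvm.hwameistor.io        # HwameiStor CSI driver (assumed name)
parameters:
  poolClass: "HDD"                    # provision volumes from the HDD pool
  replicaNumber: "1"                  # single replica for this test
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF
kubectl get sc local-storage-hdd-lvm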

Deploy distributed multi-tenant cluster (minio-operator)

This section will show how to deploy minio-operator, how to create a tenant, and how to configure HwameiStor local volumes.

Deploy minio-operator

  1. Clone the minio-operator repo to your local environment

    git clone https://github.com/minio/operator.git

    helm-repo-list

    ls-operator

  2. Enter the helm operator directory /root/operator/helm/operator

    ls-pwd

  3. Deploy the minio-operator instance

    helm install minio-operator \
    --namespace minio-operator \
    --create-namespace \
    --set persistence.storageClass=local-storage-hdd-lvm .
  4. Check minio-operator running status

    get-all
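The status check in step 4 can be done with standard kubectl commands, for example:

# List everything the operator chart created in its namespace
kubectl get all -n minio-operator
# Watch until the operator and console pods are Running
kubectl get pods -n minio-operator -w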

Create tenants

  1. Enter the /root/operator/examples/kustomization/base directory and change tenant.yaml

    git-diff-yaml

  2. Enter the /root/operator/helm/tenant/ directory and change values.yaml

    git-diff-values.yaml

  3. Enter the /root/operator/examples/kustomization/tenant-lite directory and change kustomization.yaml

    git-diff-kustomization-yaml

  4. Change tenant.yaml

    git-diff-tenant-yaml02

  5. Change tenantNamePatch.yaml

    git-diff-tenant-name-patch-yaml

  6. Create a tenant

    kubectl apply -k .
  7. Check resource status of the tenant minio-t1

    kubectl-get-all-nminio-tenant

  8. To create another tenant, first create a new tenant directory (in this example, tenant-lite-2) under /root/operator/examples/kustomization and change the files listed above

    pwd-ls-ls

  9. Run kubectl apply -k . to create the new tenant minio-t2

    kubectl-get-all-nminio
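Besides listing resources per namespace, the tenants themselves are custom resources managed by minio-operator and can be inspected directly. The CRD name below is my assumption for the operator version used here; verify it with kubectl get crd | grep minio:

# List MinIO tenants across all namespaces (CRD name assumed)
kubectl get tenants.minio.min.io -A
# Inspect one tenant in detail
kubectl -n minio-tenant describe tenant minio-t1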

Configure HwameiStor local volumes

Run the following commands in sequence to finish this configuration:

kubectl get statefulset.apps/minio-t1-pool-0 -nminio-tenant -oyaml

local-storage-hdd-lvm

kubectl get pvc -A

kubectl-get-pvc

kubectl get pvc export-minio6-0 -nminio-6 -oyaml

kubectl-get-pvc-export-oyaml

kubectl get pv

kubectl-get-pv

kubectl get pvc data0-minio-t1-pool-0-0 -nminio-tenant -oyaml

kubectl-get-pvc-oyaml

kubectl get lv

kubectl-get-lv

kubectl get lvr

kubectl-get-lvr

Test HwameiStor's support for MinIO

With the above settings in place, now let's test basic features and tenant isolation.

Test basic features

  1. Log in to the MinIO console: 10.6.163.52:30401/login

    minio-opeartor-console-login

  2. Get JWT by kubectl minio proxy -n minio-operator

    minio-opeartor-console-login

  3. Browse and manage information about newly-created tenants

    tenant01

    tenant02

    tenant03

    tenant04

    tenant05

    tenant06

  4. Log in as tenant minio-t1 (Account: minio)

    login-minio

    login-minio

  5. Browse bucket bk-1

    view-bucket-1

  6. Create a new bucket bk-1-1

    create-bucket-1-1

  7. Create path path-1-2

    create-path-1-2

    create-path-1-2

  8. Upload the file

    upload-file

  9. Upload the folder

    upload-folder

  10. Create a user with read-only permission

    create-user

    create-user
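The same basic operations (create a bucket, create a path, upload a file) can also be exercised from the command line with the MinIO client mc. This is a hedged sketch; the endpoint, credentials, and file name are placeholders for this environment:

# Point mc at the tenant's S3 endpoint (endpoint and password are placeholders)
mc alias set minio-t1 https://<tenant-minio-endpoint> minio <password> --insecure
# Create a bucket, upload a file under a path, and list it back
mc mb minio-t1/bk-1-1
mc cp ./test.txt minio-t1/bk-1-1/path-1-2/test.txt
mc ls --recursive minio-t1/bk-1-1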

Test tenant isolation

  1. Log in as tenant minio-t2

    login-t2

    login-t2

  2. Only minio-t2 information is visible. You cannot see information about tenant minio-t1.

    only-t2

  3. Create bucket

    create-bucket

    create-bucket

  4. Create path

    create-path

    create-path

  5. Upload the file

    upload-file

    upload-file

  6. Create a user

    create-user

  7. Configure user policies

    user-policy

    user-policy

  8. Delete a bucket

    delete-bucket

Conclusion

In this test, we successfully deployed MinIO distributed object storage on Kubernetes 1.22 with HwameiStor local storage, and performed the basic feature test and the tenant isolation test.

All tests passed, proving that HwameiStor can well support MinIO.

· 9 min read
Michelle Wu

Livestream Highlights

The rapid development of the cloud native industry makes many things possible, but it also poses challenges for application development and operation. As more and more stateful applications move onto container platforms, they have become major consumers of storage. As a cornerstone for running applications, storage now becomes a major challenge in the process of containerization. In the era of cloud native, what are the new requirements for storage systems? What are the new opportunities and challenges facing cloud-native storage? What are the popular cloud native storage solutions?

On June 20th, 2022, we had a live discussion with Pengyun Network on three topics: cloud-native local storage, cloud-native storage solutions for Kubernetes, and the storage systems needed in the era of cloud native.

HwameiStor

As an infrastructure for containerization in the cloud native stack, cloud-native storage exposes underlying storage services to containers and micro-services, and collects storage resources from different media. It enables stateful workloads to run in containers by providing persistent volumes. CNCF's definition of cloud-native storage has three key points. First, it runs on Kubernetes as a container. Second, it uses Kubernetes object classes, mainly custom resource definitions (CRDs). Last and most important, it uses the container storage interface (CSI). HwameiStor is developed based on these three points. It is an end-to-end cloud-native local storage system.

HwameiStor Architecture

The bottom layer of HwameiStor is a local disk manager (LDM). LDM automatically collects resources from various storage media (such as HDD, SSD, and NVMe disks) to form a local storage resource pool for unified automatic management. After HwameiStor is deployed, it can allocate resources on various storage media to different pools. Above the resource pool, it uses the logical volume manager (LVM) for management. It also uses the CSI architecture to provide a distributed local data volume service, so as to provide data persistence capabilities for cloud-native stateful applications or components.

HwameiStor is a storage system designed for cloud native. It supports high availability, automation, rapid deployment, and high performance with low cost, making it a good alternative to expensive traditional storage area networks (SANs). It has three core components:

  • Local Disk Management (LDM), which uses CRD to define and manage local data disks. LDM can explicitly get the attributes, size and other information of local disks in Kubernetes.

  • Local Storage (LS), which uses LVM to allocate logical volumes (LV) to persistent volumes (PV) after LDM is implemented.

  • Scheduler, which schedules containers to nodes with local data.

The core of HwameiStor lies in the definition and implementation of CRDs. On top of PersistentVolume (PV) and PersistentVolumeClaim (PVC) object classes in Kubernetes, HwameiStor defines a new object class that can associate PV/PVC in Kubernetes with local data disks.
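In practice, these CRDs can be inspected directly with kubectl once HwameiStor is installed; the short names in the comments (ld, lsn, lv, lvr) are the ones used elsewhere in this blog:

# Disks discovered and managed by LDM (short name: ld)
kubectl get localdisk
# Per-node storage pools managed by Local Storage (short name: lsn)
kubectl get localstoragenode
# Volumes and their replicas backing PVs (short names: lv, lvr)
kubectl get localvolume
kubectl get localvolumereplica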

HwameiStor has four features:

  • Automatic operation and management: it can automatically discover, identify, manage, and allocate disks, and schedule applications and data according to affinity. It can also automatically monitor disk status and provide timely warning.

  • Highly available data support: HwameiStor uses cross-node replicas to synchronize data and achieve high availability. When a node goes wrong, it will automatically schedule applications to highly available data nodes to ensure application continuity.

  • Rich types of data volume: HwameiStor aggregates HDD, SSD, and NVMe disks to provide data services with low latency and high throughput.

  • Flexible and dynamic linear expansion: HwameiStor can dynamically expand according to the cluster size to meet the application's requirements for data persistence.

HwameiStor is recommended in the following scenarios:

  • Adapt to middlewares with a highly available architecture

    Some middleware, such as Kafka, Elasticsearch, and Redis, has a highly available architecture and high requirements for IO data access. The LVM-based single-replica local data volumes provided by HwameiStor are suitable for this kind of application.

  • Provide highly available data volumes for applications

    MySQL and other OLTP databases require their underlying storage systems to provide highly available data storage for quick recovery. Meanwhile, data access should also have high performance. The highly available dual-replica data volumes provided by HwameiStor can meet these requirements.

  • Provide automatic operation and maintenance for traditional storage software

    MinIO, Ceph, and other similar storage software need to use the disks on Kubernetes nodes. HwameiStor's single-replica local volumes can respond quickly to the business system's needs for deployment, expansion, and migration, thus realizing automatic operation and maintenance based on Kubernetes. Such local volumes are consumed automatically through PVC/PV via the CSI driver, as shown in the sketch below.
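As a minimal illustration of that last point, a workload only needs to reference a HwameiStor StorageClass in its PVC. Everything below except the StorageClass name is a made-up example:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data                      # example name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: local-storage-hdd-lvm
  resources:
    requests:
      storage: 10Gi
EOF
# HwameiStor then creates the matching LocalVolume/LocalVolumeReplica once a pod consumes the PVC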

ZettaStor HASP

According to the CNCF 2020 Annual Report, stateful applications account for 55% of all container applications, making them mainstream. About 29% of stateful applications listed storage as the main challenge for adopting container technology. A storage problem facing cloud native scenarios is that it is difficult to balance performance and availability, and a single type of storage cannot meet all needs. Therefore, in the actual implementation of stateful applications, data storage technology is the key.

Storage Solution Comparison

ZettaStor HASP is a cloud-native data aggregation and storage platform. It is also a high-performance user-mode file system that supports redundancy protection based on multiple replicas across heterogeneous storage systems. Data replicas can flow between different types of storage systems, which is a big competitive advantage over traditional distributed storage systems. In addition, it also supports unified and flexible orchestration of storage resources, and can be tightly integrated with container platforms.

For example:

  • ZettaStor HASP has higher data access performance and can realize dynamic allocation and refined management of storage resources. This is desirable for distributed applications that have their own data redundancy mechanisms and also require high performance, such as Kafka, Redis, MongoDB, and HDFS.

  • ZettaStor HASP can achieve high data availability based on cross-node redundancy protection while ensuring the high performance of local storage. This is suitable for applications that have no data redundancy mechanisms of their own and must rely on external storage, such as MySQL and PostgreSQL.

  • Critical business should prevent the risk of simultaneous failure of two nodes. ZettaStor HASP's replica redundancy protection across local and external storage systems can ensure the smooth running of critical business.

ZettaStor HASP Architecture

ZettaStor HASP has a three-layer architecture. On the top is a high-performance distributed file system, which is also the core of HASP. This independently developed file system is fully compatible with POSIX standards and supports zero data copy between user mode and kernel mode, giving full play to the performance of high-speed media such as NVMe SSD/SCM/PM. In the middle is the data service layer, which can provide services across local storage systems on different nodes, across local storage and external storage systems, and across heterogeneous external storage systems. It can be customized into single-replica, multi-replica, strong-consistency, and weak-consistency solutions. At the bottom is a storage management layer, which is responsible for interacting with different storage devices. It can break device barriers and connect data islands with a unified data format, helping data and business get rid of device limitations. In addition, it can also schedule storage devices through CSI.

ZettaStor HASP is a distributed storage platform closely integrated with container platforms. It supports hyper-converged deployment and CSI integration. It has node affinity, and can sense the running status of user's pods, making storage behavior more adaptable.

Round Table Discussion

Q1: What is cloud-native storage

alexzhc: In a narrow sense, cloud-native storage needs to meet three standards. First, it should meet the CSI specification and connect well with it. Second, it should be deployed on Kubernetes as a container. Third, information in the storage system should be modeled as new object classes through CRDs and eventually stored in Kubernetes.

fengqinah: Cloud-native storage has multiple features and can meet various demands. In addition to providing a unified storage platform that can offer different storage features for different applications, it should also connect CSI and establish a bridge between storage systems and Kubernetes for communication.

niulechuan: The most common cloud-native storage solutions are based on cloud storage or distributed storage. Meanwhile, some service providers are also trying to extend the special capabilities of traditional storage.

Q2: How should cloud-native storage support cloud-native applications

fengqinah: There are mainly two points. First, cloud-native storage should support features of cloud-native applications, because these features decide the application's requirements for storage. Second, it should meet CSI specifications to support special cloud native requirements.

niulechuan: From a performance perspective, cloud-native storage needs to meet all requirements of the CSI architecture, so that it can fit diverse cloud-native scenarios. In order to provide good underlying support and response guarantees for cloud-native applications, cloud-native storage needs efficient operation and maintenance. In real cases, cost, portability, and technical support should also be taken into consideration when designing a cloud-native storage solution. Like cloud-native applications, cloud-native storage also requires "elasticity": it should have elastic computing capabilities, scalable data, and conversion between hot and cold data. In addition, cloud-native storage should also have an open ecosystem.

Q3: What are the similarities and differences between cloud-native storage and traditional storage? How about their advantages and disadvantages

alexzhc: Cloud-native storage is aggregated with Kubernetes after deployment, while traditional storage is often separated from Kubernetes after deployment. Cloud-native storage runs on Kubernetes, making it convenient to develop micro-services. With traditional storage, developers might need to extend the storage API. However, the aggregated form may also, to a certain extent, let problems in Kubernetes spill over into storage, bringing difficulties to operation and maintenance. Besides, cloud-native storage also has issues with network sharing and disk load.

fengqinah: With external storage, storage nodes and computing nodes have little mutual impact. Back-end storage is generally distributed, with relatively high security and availability. However, in cloud native scenarios, using only external storage has certain disadvantages: it increases network consumption, generates additional cost, and lacks linkage with Kubernetes.

Q4: How should cloud-native storage empower traditional storage

niulechuan: This is a very urgent need. Kubernetes-native capabilities, such as deletion, creation, and expansion, are mainly implemented through CRDs, and the community plays an active role in this process. At the same time, some capabilities of traditional storage have not yet been realized in cloud-native storage, such as cron jobs and observability. How to make cloud-native storage better empower traditional storage on platforms, give full play to their advantages, and further advance cloud native storage? This is a long way to go, and our team is working on it.

alexzhc: To add one more thing: a common way to connect Kubernetes and traditional storage is to aggregate CSI drivers. However, although CSI defines some storage operation flows, it is an interface after all. Therefore, we should consider whether the CSI community should use CRDs to define some features of traditional storage, and whether service providers can define some high-level, special flows through CRDs. We should try to make Kubernetes more applicable in the storage field.

· 15 min read
zhaosimon
Michael Yao

The test environment is Kubernetes 1.22. We deployed TiDB with TiDB Operator 1.3 after the HwameiStor local storage was attached. Then, we performed the basic SQL capability test, the system security test, and the operation and maintenance management test.

All the tests passed successfully, showing that HwameiStor can support distributed database application scenarios such as TiDB, which demand high availability, strong consistency, and large data scale.

Introduction to TiDB

TiDB is a distributed database product that supports OLTP (Online Transactional Processing), OLAP (Online Analytical Processing), and HTAP (Hybrid Transactional and Analytical Processing) services, and is compatible with key features such as the MySQL 5.7 protocol and the MySQL ecosystem. The goal of TiDB is to provide users with a one-stop OLTP, OLAP, and HTAP solution, which is suitable for various application scenarios such as high availability, strict requirements for strong consistency, and large data scale.

TiDB architecture

The TiDB distributed database splits the overall architecture into multiple modules that can communicate with each other. The architecture diagram is as follows:

TiDB architecture

  • TiDB Server

    The SQL layer exposes the connection endpoints of the MySQL protocol to the outside world, and is responsible for accepting connections from clients, performing SQL parsing and optimization, and finally generating a distributed execution plan. The TiDB layer itself is stateless. In practice, you can start several TiDB instances. A unified access address is provided externally through load-balancing components (such as LVS, HAProxy, or F5), and client connections can be evenly distributed onto these TiDB instances. The TiDB server itself does not store data; it only parses SQL and forwards the actual data read requests to the underlying storage node, TiKV (or TiFlash).

  • PD (Placement Driver) Server

    The metadata management module across a TiDB cluster is responsible for storing the real-time data distribution of each TiKV node and the overall topology of the cluster, providing the TiDB Dashboard management and control interface, and assigning transaction IDs to distributed transactions. Placement Driver (PD) not only stores metadata, but also issues data scheduling commands to specific TiKV nodes based on the real-time data distribution status reported by TiKV nodes, which can be said to be the "brain" of the entire cluster. In addition, the PD itself is also composed of at least 3 nodes and has high availability capabilities. It is recommended to deploy an odd number of PD nodes.

  • Storage nodes

    • TiKV Server: In charge of storing data. From the outside, TiKV is a distributed Key-Value storage engine that provides transactions. The basic unit for storing data is the Region. Each Region is responsible for storing the data of a Key Range (the left-closed, right-open interval from StartKey to EndKey), and each TiKV node is responsible for multiple Regions. The TiKV API provides native support for distributed transactions at the KV key-value pair level, and provides the Snapshot Isolation (SI) level by default, which is also the core of TiDB's support for distributed transactions at the SQL level. After the SQL layer of TiDB completes the SQL parsing, it converts the SQL execution plan into actual calls to the TiKV API. Therefore, the data is stored in TiKV. In addition, TiKV data is automatically maintained in multiple replicas (three replicas by default), which naturally supports high availability and automatic failover.

    • TiFlash is a special storage node. Unlike ordinary TiKV nodes, data is stored in columns in TiFlash, and the main function is to accelerate analysis-based scenarios.

TiDB database storage

TiDB database storage

  • Key-Value Pair

    The choice of TiKV is the Key-Value model that provides an ordered traversal method. Two key points of TiKV data storage are:

    • A huge Map (comparable to std::map in C++) that stores Key-Value Pairs.

    • The Key-Value pairs in this Map are sorted by the binary order of the Key, that is, you can seek to the position of a certain Key, and then continuously call the Next method to obtain the Key-Value larger than this Key in an ascending order.

  • Local storage (RocksDB)

    In any persistent storage engine, data must eventually be saved on disk, and TiKV is no different. However, TiKV does not write data directly to the disk; instead it stores the data in RocksDB, and RocksDB is responsible for the actual data storage. The reason is that developing a standalone storage engine requires a lot of work, especially a high-performance standalone engine, which may require various meticulous optimizations. RocksDB is a very good standalone KV storage engine open-sourced by Facebook, and it can meet the various requirements of TiKV for a single engine. Here we can simply consider RocksDB to be a persistent Key-Value Map on a single host.

  • Raft protocol

    TiKV uses the Raft algorithm to ensure that data is not lost and error-free when a single host fails. In short, it is to replicate data to multiple hosts, so that if one host cannot provide services, replicas on other hosts can still provide services. This data replication scheme is reliable and efficient, and can deal with replica failures.

  • Region

    TiKV divides the Range by Key: a certain segment of consecutive Keys is stored on one storage node. The entire Key-Value space is divided into many segments, and each segment, a series of consecutive Keys, is called a Region. TiKV tries to keep the data saved in each Region within a reasonable size; currently, the default in TiKV is no more than 96 MB. Each Region can be described by a left-closed, right-open interval such as [StartKey, EndKey).

  • MVCC

    TiKV implements Multi-Version Concurrency Control (MVCC).

  • Distributed ACID transactions

    TiKV uses the transaction model used by Google in BigTable: Percolator.

Build the test environment

Kubernetes cluster

In this test, we use three VM nodes to deploy the Kubernetes cluster, including one master node and two worker nodes. The kubelet version is 1.22.0.

k8s cluster

HwameiStor local storage

  1. Deploy the HwameiStor local storage in the Kubernetes cluster

    HwameiStor local storage

  2. Configure a 100 GB local disk (sdb) for HwameiStor on each of the two worker nodes

    sdb1

    sdb2

  3. Create StorageClass

    create StorageClass

Deploy TiDB on Kubernetes

TiDB can be deployed on Kubernetes using TiDB Operator. TiDB Operator is an automatic operation and maintenance system for TiDB clusters on Kubernetes. It provides full lifecycle management of TiDB including deployment, upgrade, scaling, backup and recovery, and configuration changes. With TiDB Operator, TiDB can run seamlessly on public cloud or privately deployed Kubernetes clusters.

The compatibility between TiDB and TiDB Operator versions is as follows:

TiDB version       | Applicable versions of TiDB Operator
dev                | dev
TiDB >= 5.4        | 1.3
5.1 <= TiDB < 5.4  | 1.3 (recommended), 1.2
3.0 <= TiDB < 5.1  | 1.3 (recommended), 1.2, 1.1
2.1 <= TiDB < 3.0  | 1.0 (maintenance stopped)

Deploy TiDB Operator

  1. Install TiDB CRDs

    kubectl apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/master/manifests/crd.yaml
  2. Install TiDB Operator

    helm repo add pingcap https://charts.pingcap.org/ 
    kubectl create namespace tidb-admin
    helm install --namespace tidb-admin tidb-operator pingcap/tidb-operator --version v1.3.2 \
    --set operatorImage=registry.cn-beijing.aliyuncs.com/tidb/tidb-operator:v1.3.2 \
    --set tidbBackupManagerImage=registry.cn-beijing.aliyuncs.com/tidb/tidb-backup-manager:v1.3.2 \
    --set scheduler.kubeSchedulerImageName=registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler
  3. Check TiDB Operator components

    check TiDB Operator components

Deploy the TiDB cluster

kubectl create namespace tidb-cluster && \
kubectl -n tidb-cluster apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/master/examples/basic/tidb-cluster.yaml
kubectl -n tidb-cluster apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/master/examples/basic/tidb-monitor.yaml

deploy TiDB cluster

Connect the TiDB cluster

yum -y install mysql-client

connect tidb

kubectl port-forward -n tidb-cluster svc/basic-tidb 4000 > pf4000.out & 

connect TiDB cluster

connect tidb cluster

connect TiDB cluster

Check and verify the TiDB cluster status

  1. Create the Hello_world table

    create table hello_world (id int unsigned not null auto_increment primary key, v varchar(32)); 

    create Hello_world table

  2. Check the TiDB version

    select tidb_version()\G;

    check version

  3. Check the Tikv storage status

    select * from information_schema.tikv_store_status\G;

    check storage

Configure the HwameiStor storage

Create a PVC for tidb-tikv and tidb-pd from storageClass local-storage-hdd-lvm:

HwameiStor storage config

pvc1

pvc2

kubectl get po basic-tikv-0 -n tidb-cluster -oyaml

mountpvc

kubectl get po basic-pd-0 -n tidb-cluster -oyaml

mountpvc1
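One way to confirm that the tidb-pd and tidb-tikv volumes really landed on the HwameiStor StorageClass is to list the PVCs together with their storage class and bound PV:

# Show each PVC in the TiDB namespace with its StorageClass, PV, and status
kubectl -n tidb-cluster get pvc -o custom-columns=NAME:.metadata.name,STORAGECLASS:.spec.storageClassName,PV:.spec.volumeName,STATUS:.status.phase
# Cross-check the HwameiStor side
kubectl get lv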

Test procedure

Basic SQL capability test

After the database cluster was deployed, we performed the following tests of basic capabilities. All of them passed successfully.

Distributed transaction

Test purpose: Check whether the integrity properties of distributed data operations, namely atomicity, consistency, isolation, and durability (ACID), are supported under multiple isolation levels

Test steps:

  1. Create the database: testdb

  2. Create the table t_test ( id int AUTO_INCREMENT, name varchar(32), PRIMARY KEY (id) )

  3. Run a test script

Test result: The integrity properties of distributed data operations, namely atomicity, consistency, isolation, and durability (ACID), are supported under multiple isolation levels
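As a minimal sketch of the kind of script used in step 3 (assuming the port-forward from the connection step is still active and root has no password in this test cluster), an explicit transaction that is rolled back should leave no trace in the table:

mysql -h 127.0.0.1 -P 4000 -u root -D testdb -e "
begin;
insert into t_test(name) values ('txn-demo');
rollback;
-- atomicity: the rolled-back row must not be visible
select count(*) as leftover from t_test where name='txn-demo';"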

Object isolation

Test purpose: Check if the object isolation can be implemented by using different schemas

Test script:

create database if not exists testdb;
use testdb
create table if not exists t_test
( id bigint,
name varchar(200),
sale_time datetime default current_timestamp,
constraint pk_t_test primary key (id)
);
insert into t_test(id,name) values (1,'a'),(2,'b'),(3,'c');
create user 'readonly'@'%' identified by "readonly";
grant select on testdb.* to readonly@'%';
select * from testdb.t_test;
update testdb.t_test set name='aaa';
create user 'otheruser'@'%' identified by "otheruser";

Test result: Creating different schemas to implement object isolation is supported

Table operation support

Test purpose: Check if you can create, delete, and modify table data, DML, columns, and partition tables

Test steps: Run the test scripts step by step after connecting the database

Test script:

# Create and delete table
drop table if exists t_test;
create table if not exists t_test
( id bigint default '0',
name varchar(200) default '' ,
sale_time datetime default current_timestamp,
constraint pk_t_test primary key (id)
);
# Delete and modify
insert into t_test(id,name) values (1,'a'),(2,'b'),(3,'c'),(4,'d'),(5,'e');
update t_test set name='aaa' where id=1;
update t_test set name='bbb' where id=2;
delete from t_dml where id=5;
# Modify, add, delete columns
alter table t_test modify column name varchar(250);
alter table t_test add column col varchar(255);
insert into t_test(id,name,col) values(10,'test','new_col');
alter table t_test add column colwithdefault varchar(255) default 'aaaa';
insert into t_test(id,name) values(20,'testdefault');
insert into t_test(id,name,colwithdefault ) values(10,'test','non-default ');
alter table t_test drop column colwithdefault;
# Type of partition table (only listed part of scripts)
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT NOT NULL,
store_id INT NOT NULL
)

Test result: Creating, deleting, and modifying table data, DML, columns, and partition tables are supported

Index support

Test purpose: Verify different indexes (unique, clustered, partitioned, Bidirectional indexes, Expression-based indexes, hash indexes, etc.) and index rebuild operations.

Test script:

alter table t_test add unique index udx_t_test (name);
# The default is clustered index of primary key
ADMIN CHECK TABLE t_test;
create index time_idx on t_test(sale_time);
alter table t_test drop index time_idx;
admin show ddl jobs;
admin show ddl job queries 156;
create index time_idx on t_test(sale_time);

Test result: Creating, deleting, combining, and listing indexes is supported, including unique indexes

Statements

Test purpose: Check if the statements in distributed databases are supported such as if, case when, for loop, while loop, loop exit when (up to 5 kinds)

Test script:

SELECT CASE id WHEN 1 THEN 'first' WHEN 2 THEN 'second' ELSE 'OTHERS' END AS id_new  FROM t_test;
SELECT IF(id>2,'int2+','int2-') from t_test;

Test result: supported for statements such as if, case when, for loop, while loop, and loop exit when (up to 5 kinds)

Parsing execution plan

Test purpose: Check if execution plan parsing is supported for distributed databases

Test script:

explain analyze select * from t_test where id NOT IN (1,2,4);
explain analyze select * from t_test a where EXISTS (select * from t_test b where a.id=b.id and b.id<3);
explain analyze SELECT IF(id>2,'int2+','int2-') from t_test;

Test result: Execution plan parsing is supported

Binding execution plan

Test purpose: Verify the feature of binding execution plan for distributed databases

Test steps:

  1. View the current execution plan of SQL statements

  2. Use the binding feature

  3. View the execution plan after the SQL statement is bound

  4. Delete the binding

Test script:

explain select * from employees3 a join employees4 b on a.id = b.id where a.lname='Johnson';
explain select /*+ hash_join(a,b) */ * from employees3 a join employees4 b on a.id = b.id where a.lname='Johnson';

Test result: Without the hint, the optimizer may not choose hash_join; with the hint, it must use hash_join.

Common functions

Test purpose: Verify standard functions of distributed databases

Test result: Standard database functions are supported

Explicit/implicit transactions

Test purpose: Verify the transaction support of distributed databases

Test result: Explicit and implicit transactions are supported

Character set

Test purpose: Verify the character sets supported by the distributed database

Test result: Only the utf8mb4 character set is supported at present

Lock support

Test purpose: Verify the lock implementation of distributed databases

Test result: Verified how locks are implemented, what the blocking conditions are in the R-R/R-W/W-W cases, and how deadlocks are handled

Isolation levels

Test purpose: Verify the transactional isolation levels of distributed databases

Test result: The SI and RC isolation levels are supported (4.0 GA version)

Distributed complex query

Test purpose: Verify the complex query capabilities of distributed databases

Test result: Distributed complex queries and operations such as inter-node joins are supported, as are window functions and hierarchical queries
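For reference, two of the query shapes exercised here can be reproduced against the same test table; these are illustrative only:

# Window function and a join-style query, run through the MySQL client
mysql -h 127.0.0.1 -P 4000 -u root -D testdb -e "
select id, name, row_number() over (order by id) as rn from t_test;
select a.id, a.name from t_test a join t_test b on a.id = b.id where b.id < 3;"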

System security test

This section describes the system security tests. After the database cluster was deployed, all the following tests passed.

Account management and permission test

Test purpose: Verify the account permission management of distributed databases

Test script:

select host,user,authentication_string from mysql.user;
create user tidb IDENTIFIED by 'tidb';
select host,user,authentication_string from mysql.user;
set password for tidb =password('tidbnew');
select host,user,authentication_string,Select_priv from mysql.user;
grant select on *.* to tidb;
flush privileges ;
select host,user,authentication_string,Select_priv from mysql.user;
grant all privileges on *.* to tidb;
flush privileges ;
select * from mysql.user where user='tidb';
revoke select on *.* from tidb;
flush privileges ;
revoke all privileges on *.* from tidb;
flush privileges ;
grant select(id) on test.TEST_HOTSPOT to tidb;
drop user tidb;

Test results:

  • Creating, modifying, and deleting accounts and configuring passwords are supported, as is the separation of security, audit, and data management

  • Based on different accounts, permission control for the database covers the instance, database, table, and column levels

Access control

Test purpose: Verify the permission access control of distributed databases, and control the database data by granting basic CRUD (create, read, update, and delete) permissions

Test script:

mysql -u root -h 172.17.49.222 -P 4000
drop user tidb;
drop user tidb1;
create user tidb IDENTIFIED by 'tidb';
grant select on tidb.* to tidb;
grant insert on tidb.* to tidb;
grant update on tidb.* to tidb;
grant delete on tidb.* to tidb;
flush privileges;
show grants for tidb;
exit;
mysql -u tidb -h 172.17.49.222 -ptidb -P 4000 -D tidb -e 'select * from aa;'
mysql -u tidb -h 172.17.49.222 -ptidb -P 4000 -D tidb -e 'insert into aa values(2);'
mysql -u tidb -h 172.17.49.222 -ptidb -P 4000 -D tidb -e 'update aa set id=3;'
mysql -u tidb -h 172.17.49.222 -ptidb -P 4000 -D tidb -e 'delete from aa where id=3;'

Test result: Database data is controlled by granting the basic CRUD permissions

Whitelist

Test purpose: Verify the whitelist feature of distributed databases

Test script:

mysql -u root -h 172.17.49.102 -P 4000
drop user tidb;
create user tidb@'127.0.0.1' IDENTIFIED by 'tidb';
flush privileges;
select * from mysql.user where user='tidb';
mysql -u tidb -h 127.0.0.1 -P 4000 -ptidb
mysql -u tidb -h 172.17.49.102 -P 4000 -ptidb

Test result: The IP whitelist feature is supported, including matching by IP segments

Operation log

Test purpose: Verify the monitoring capability of distributed databases

Test script: kubectl -ntidb-cluster logs tidb-test-pd-2 --tail 22

Test result: Key actions or misoperations performed by users through the operation and maintenance console or API are recorded

Operation and maintenance test

This section describes the operation and maintenance tests. After the database cluster was deployed, the following operation and maintenance tests all passed.

Import and export data

Test purpose: Verify the tools support for importing and exporting data of distributed databases

Test script:

select * from sbtest1 into outfile '/sbtest1.csv';
load data local infile '/sbtest1.csv' into table test100;

Test result: Importing and exporting tables, schemas, and databases are supported

Slow log query

Test purpose: Get the SQL info by slow query

Prerequisite: The SQL execution time shall be longer than the configured threshold for slow query, and the SQL execution is completed

Test steps:

  1. Adjust the slow query threshold to 100 ms

  2. Run SQL

  3. View the slow query info from log, system table, or dashboard

Test script:

show variables like 'tidb_slow_log_threshold';
set tidb_slow_log_threshold=100;
select query_time, query from information_schema.slow_query where is_internal = false order by query_time desc limit 3;

Test result: Can get the slow query info.

For details about test data, see TiDB on HwameiStor Deployment and Test Logs.

· 5 min read
Lechuan Niu
Michelle Wu

In Kubernetes, when a PVC is created and uses HwameiStor as its local storage, HwameiStor creates two kinds of CRs: LocalVolume and LocalVolumeReplica. Why create these two resources for one PV? Read on and you will find the answer.

LV Replicas

LocalVolume

LocalVolume is a CRD defined by HwameiStor. It is the volume that HwameiStor provides for users. Each LocalVolume corresponds to a PersistentVolume in Kubernetes. Both are volumes, but the LocalVolume stores HwameiStor-related information, while the PersistentVolume records Kubernetes-side information and links to the LocalVolume.

You can check details of LocalVolume with this command:

#  check status of local volume and volume replica
$ kubectl get lv # or localvolume
NAME POOL KIND REPLICAS CAPACITY ACCESSIBILITY STATE RESOURCE PUBLISHED AGE
pvc-996b05e8-80f2-4240-ace4-5f5f250310e2 LocalStorage_PoolHDD LVM 1 1073741824 k8s-node1 Ready -1 22m
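Because the LocalVolume carries the same pvc-<uuid> name as the PV it backs (as in the output above), the Kubernetes view and the HwameiStor view of the same volume can be correlated directly. A small sketch, where the PVC name and namespace are placeholders:

# Find the PV bound to a given PVC, then look at both sides of it
PV=$(kubectl -n <namespace> get pvc <your-pvc> -o jsonpath='{.spec.volumeName}')
kubectl get pv "$PV"
kubectl get lv "$PV" -o yaml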

Now that HwameiStor can use LocalVolume to provide a volume, why do we still need LocalVolumeReplica?

LocalVolumeReplica

LocalVolumeReplica is another CRD defined by HwameiStor. It represents a replica of a volume.

In HwameiStor, a LocalVolume can designate one of its LocalVolumeReplicas as the active replica. As a volume, a LocalVolume can have many LocalVolumeReplicas. The replica in the active state is mounted by applications, while the others stand by as highly available replicas.

You can check details of LocalVolumeReplica with this command:

$ kubectl get lvr # or localvolumereplica
NAME KIND CAPACITY NODE STATE SYNCED DEVICE AGE
pvc-996b05e8-80f2-4240-ace4-5f5f250310e2-v5scm9 LVM 1073741824 k8s-node1 Ready true /dev/LocalStorage_PoolHDD/pvc-996b05e8-80f2-4240-ace4-5f5f250310e2 80s

LocalVolumeReplica allows HwameiStor to support features like HA, volume migration, hot standby, and fast recovery of Kubernetes applications, making it more competitive as a local storage tool.
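How does a volume end up with more than one LocalVolumeReplica? Typically through the StorageClass: a HwameiStor StorageClass with a replica count of 2 makes every volume it provisions highly available. The following is a hedged sketch; the provisioner and parameter names are assumptions and should be checked against the StorageClasses shipped with your HwameiStor release:

cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hwameistor-storage-lvm-hdd-ha   # example name
provisioner: lvm.hwameistor.io          # assumed driver name
parameters:
  poolClass: "HDD"
  replicaNumber: "2"                    # two replicas: one active, one standby
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
EOF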

Conclusion

LocalVolume and LocalVolumeReplica are common concepts in many storage products, but each product can have its own competitive and unique features based on these two concepts. A technical difficulty can be solved with different solutions, so these concepts are also suitable for different production scenarios.

We will provide more capabilities for more scenarios in future releases. Both users and developers are welcome to join us!

· 2 min read
Jie Li
Michael Yao

News today: The HwameiStor Reliable Helper System, an automatic and reliable cloud-native local storage maintenance system dedicated to system operation and maintenance, has been launched.

DaoCloud has officially open-sourced the HwameiStor Reliable Helper System, a cloud-native, automatic, and reliable local storage maintenance system. The system is still in the alpha stage. HwameiStor creates a local storage pool with HDD, SSD, and NVMe disks for central management. As the underlying foundation for application data, disks often face risks such as natural wear and accidental damage, and the Reliable Helper System is designed for exactly this kind of disk operation and maintenance. All developers and enthusiasts are welcome to try it out.

System architecture

In the cloud native era, application developers can focus on the business logic itself, while the agility, scalability, and reliability required by the application runtime fall to the infrastructure platform and the O&M team. The HwameiStor Reliable Helper System is a reliable operation and maintenance system that meets the requirements of the cloud-native era. Currently, it supports one-click disk replacement.

Comprehensively enhance the operation and maintenance

Reliable, one-click replacement, alert reminder

  • Reliable data migration and backfill

    Automatically recognize RAID disks and determine if the data migration and backfill is required to guarantee data reliability.

  • One-click disk replacement

    This feature is implemented by using the disk uuid.

  • Intuitive alert reminder

    If any exceptions occur in the process of one-click disk replacement, the system will raise an alert to remind you.

Join us

If the coming future is an era of intelligent Internet, developers will be the pioneers to that milestone, and the open source community will become the "metaverse" of developers.

If you have any questions about the HwameiStor cloud-native local storage system, welcome to join the community to explore this metaverse world dedicated for developers and grow together.

· One min read
Michael Yao

Welcome to the Hwameistor blog space.

Here you can keep up with the progress of the Hwameistor open source project and recent hot topics.

We also plan to include release notes for major releases, guidance articles, community-related events, and possibly some development tips, and interesting topics within the team.

If you are interested in contributing to this open source project and would like to join the discussion or make some guest blog posts, please contact us.

GitHub address is: https://github.com/hwameistor