Data Structure

WILD

Administrator
Staff member
ADMIN
SELLER
SUPREME
MEMBER
Joined
Jan 21, 2025
Messages
219
Reaction score
636
Deposit
0$
Why are tables different?MySQL is a transactional database. Key features include integrity, locking, foreign keys, and point updates.ClickHouse is an analytical database. Key features include: big data scanning speed, compression, and columnar storage.Therefore, the structures are slightly different:-- MySQL: with indexes for fast lookups by ID and dateCREATE TABLE `records` (`id` char(32) NOT NULL PRIMARY KEY, -- Fast lookup by key`code` varchar(50) NOT NULL,`number` varchar(20) NOT NULL,`record_date` date NOT NULL, -- Separate date and time`record_time` datetime NOT NULL, -- for easy filtering`hour` tinyint(3) unsigned NOT NULL, -- Denormalization for frequent queries`is_completed` tinyint(1) NOT NULL,`type_id` int(11) DEFAULT NULL,`value` int(11) DEFAULT NULL,INDEX idx_date (`record_date`), -- Index for filtering by dateINDEX idx_number (`number`) -- Index for searching by number) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;Explain with-- ClickHouse: columnar storage, sorting by dateCREATE TABLE records(`id` String,`code` String,`number` String,`record_date` Date,`record_time` DateTime,`hour` UInt8,`is_completed` UInt8, -- ClickHouse doesn't have bool, UInt8 is used`type_id` Nullable(Int32), -- Nullable for optional fields`value` Nullable(Int32))ENGINE = MergeTree -- ClickHouse's main enginePARTITION BY toYYYYMM(record_date) -- Split by month for quick deletion of old partitionsPRIMARY KEY (id) -- Primary key for sortingORDER BY (id, record_date, record_time); -- Storage order on diskExplain withExplanation of ClickHouse fields:pARTITION BY — data is physically divided into folders by month. This allows for quick deletion of old data (simply delete the folder).ORDER BY — data on disk is sorted first by id, then by date. This speeds up queries filtering by these fields.UInt8 instead of bool — ClickHouse doesn't have a boolean type; an 8-bit number is used.1.2. Source table for migrationClickHouse can connect to MySQL directly through a special table engine:CREATE TABLE records_sourceENGINE = MySQL('host:3306', -- MySQL address'database', -- Database name'records', -- Table name'user', -- User'password'); -- PasswordExplain withThis is not a copy of the data, but a "window" into MySQL. When you perform a SELECT statement from this table, ClickHouse actually goes to MySQL and pulls the data from there. This is convenient for migration, but slow for frequent queries.Part 2. Working with the Server2.1. Install everything you needStep 1. Install the MySQL client# Install the MySQL clientsudo apt install -y mysql-client# Check that it's installedwhich mysql# You should see: /usr/bin/mysql# Check the versionmysql --version# You should see something like: mysql Ver 8.0.xxExplain withStep 2. Preparing to install the ClickHouse client# Install support packagessudo apt install -y apt-transport-https ca-certificates curl gnupgExplain withStep 3. Add the ClickHouse repository# Download and add the repository keycurl-fsSL https://packages.clickhouse.com/rpm/lts/repodata/repomd.xml.key | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/clickhouse.gpg# Add repository to sourcesecho "deb https://packages.clickhouse.com/deb stable main" | sudo tee /etc/apt/sources.list.d/clickhouse.listExplain withStep 4. Installing the ClickHouse client# Update the package list (now with the new repository)sudo apt update# Install the ClickHouse clientsudo apt install -y clickhouse-client# Check that it's installedwhich clickhouse-client# Should see: /usr/bin/clickhouse-client# Check the versionclickhouse-client --version# Should see: ClickHouse client version 24.xx.x.xExplain withStep 5. Checking installed programs# Ensure both clients are availableecho "MySQL: $(mysql --version)"echo "ClickHouse: $(clickhouse-client --version)"Explain withGreat, halfway there!2.2. Migration scriptWe'll do the migration via CRON and a Bash script so that the server itself calls the procedure, and we'll forget that it works like that.1. Create a scriptnano /path/to/migrate.shExplain with2. What to write to the file:#!/bin/bash# Connection parametersMYSQL_HOST="localhost"MYSQL_USER="user"MYSQL_PASSWORD="password"MYSQL_DB="database"CLICKHOUSE_USER="default"CLICKHOUSE_PASSWORD="password"CLICKHOUSE_DB="database"echo "=== START MIGRATION $(date) ==="# Copy yesterday's data from MySQL to ClickHouse/usr/bin/clickhouse-client -u $CLICKHOUSE_USER --password $CLICKHOUSE_PASSWORD --query "INSERT INTO $CLICKHOUSE_DB.recordsSELECT * FROM $CLICKHOUSE_DB.records_sourceWHERE record_date < CURDATE()" # <-- HF
 
Top Bottom