Java sub database and sub table

It Maple fighter 2022-01-15 03:46:30 阅读数:818

java sub database sub table

java Sub database and sub table

Why is the database divided into tables

  • stay P2P In the platform , The target information and bidding information exist as the basic business data of the platform . With the development of the platform , These data may be more and more , Even up to 100 million
  • With MySQL For example , The amount of data in a single database is 5000 The performance is better within ten thousand , When the threshold is exceeded, the performance will decrease significantly with the increase of the amount of data
  • The amount of data in a single table exceeds 1000w, Performance will also degrade severely . This query will take longer , Concurrent operations may get stuck when they reach a certain amount , Even drag the system down , So our P2P The platform needs to solve this performance bottleneck
  • Can we improve the data processing capability by improving the hardware capability of the server ?
  • can , But this scheme is expensive , And there is an upper limit to improving the hardware . Can we distribute the data in different databases , Make the data volume of a single database and table smaller , So as to improve the performance of database operation ? Sure , This is the database sub database sub table
  • Dividing databases and tables is to split large databases and data tables according to a certain strategy ?
  • The aim is : Lower per library 、 The amount of data per table , Reduce the load on the database , Improve the efficiency of database , Shorten the query time . in addition , Because the transformation of sub database and sub table is controllable , Underlying or based on RDBMS, Therefore, the entire database operation and maintenance system and related infrastructure are reusable

The way of sub database and sub table

Vertical sub table

  • When users browse goods on the e-commerce platform , The first thing you see is the basic information of the commodity , If you are interested in the product, you will continue to view the detailed description of the product

  • The access frequency of commodity basic information is higher than that of commodity detailed description information , The access efficiency of commodity basic information is higher than that of commodity detailed description information ( Large field ). Because of the difference between the two kinds of data , Therefore, consider splitting the commodity information table as follows :

  • [ Failed to transfer the external chain picture , The origin station may have anti-theft chain mechanism , It is suggested to save the pictures and upload them directly (img-HfNOmr4D-1635769600224)(C:\Users\admin\AppData\Roaming\Typora\typora-user-images\1635768740666.png)]

  • This split is called Vertical sub table . Vertical table definition : Spread the fields of one table into multiple tables , Each table stores some of the words ** paragraph .** The improvement brought by vertical table is :

    • Reduce IO Scramble for , Reduce the chance of locking the table , Viewing product details and product overview do not affect each other
    • Give full play to the operation efficiency of high-frequency data , The high efficiency of commodity overview data operation will not be dragged down by the low efficiency of commodity detail data operation
  • Generally speaking , The access frequency of each data item in a business entity is different , Some data items may take up more storage space BLOB or TEXT, For example, in the example above Item description field . therefore , When the amount of data is large , Sure Split the table by field , Will hot fields 、 Unpopular fields are placed separately in different tables . Performance improvement brought by vertical segmentation , It mainly focuses on the operation efficiency of popular data , And disk contention is reduced . We usually split vertically according to the following principles :

    • Put the infrequently used fields in a separate table
    • hold text,blob Such large fields are split and placed separately in a table
    • Often, the fields of combined queries are placed in a single table

Vertical sub database

  • Through the vertical sub table , Database performance has been improved to a certain extent , But it hasn't met the requirements , And the disk space is running out , Because the data is always stored on one server . The vertical table division in the library only solves the problem of too much data in a single table , But it doesn't distribute tables to libraries on different machines , Therefore, for reducing the pressure on the database , Limited role , We are still competing for the same physical machine CPU、 Memory 、 The Internet IO、 disk

  • Take e-commerce platform as an example , You can put the original SELLER_DB( Seller Library ), Split into PRODUCT_DB( Commodity bank ) and STORE_DB( Shop Library ), And distribute the two libraries to different servers , As shown in the figure below :

  • [ Failed to transfer the external chain picture , The origin station may have anti-theft chain mechanism , It is suggested to save the pictures and upload them directly (img-RU0sxPdy-1635769600226)(C:\Users\admin\AppData\Roaming\Typora\typora-user-images\1635768939463.png)]

  • because Commodity information And Commodity Description High business coupling , So they are stored together in PRODUCT_DB( Commodity bank ); and Store information Relatively independent , So it's stored separately in STORE_DB( Shop Library ), This is called Vertical sub database

  • Vertical sub database It refers to classifying tables according to business , Distributed to different databases , Each library can be placed on a different server , So as to achieve the effect of sharing pressure among multiple servers . The improvement brought about by vertical sub database is :

    • Solve business level coupling , Business is clear
    • It can manage the data of different services in different levels 、 maintain 、 monitor 、 Extension etc.
    • High concurrency scenarios , To a certain extent, the vertical sub database can improve IO、 Number of database connections 、 Performance of stand-alone hardware resources

Horizontal sub database

  • after Vertical sub table and vertical sub database after , The database performance problem is completely solved ? Suppose an e-commerce platform is developing rapidly ,PRODUCT_DB( Commodity bank ) The data stored in a single database has exceeded the estimate . Suppose that the current platform has 8w The store , Each store has an average of 150 Products of different specifications , Add in the growth , Then the quantity of goods will reach 1500w+ Level , also PRODUCT_DB( Commodity bank ) It belongs to a very frequent resource , Performance bottlenecks reappear

  • Can you divide the library vertically again ? From a business perspective , At present, it is impossible to split vertically again . So we thought of another way , Judge the goods ID Is it odd or even , Then store the commodity information in two databases . in other words , To manipulate a piece of data , First analysis The commodity of this data ID, If the goods ID It's odd , Map this operation to RRODUCT_DB1( Commodity bank 1); If the goods ID For the even , Map operations to RRODUCT_DB2( Commodity bank 2), This is called Horizontal sub database

  • [ Failed to transfer the external chain picture , The origin station may have anti-theft chain mechanism , It is suggested to save the pictures and upload them directly (img-cLX5OXgi-1635769600230)(C:\Users\admin\AppData\Roaming\Typora\typora-user-images\1635769062158.png)]

  • Horizontal sub database It is to split the data of the same table into different databases according to certain rules , Each library can be placed on a different server . The improvement it brings is :

    • Solved single database big data , The performance bottleneck of high concurrency
    • Split according to the reasonable split rules ,join Basically avoid cross library operation
    • Improve the stability and availability of the system
  • When an application is no longer fine-grained for vertical segmentation , Or the number of data rows is still huge after segmentation , There is a single library reading and writing 、 Storage performance bottlenecks , It's time to do Horizontal sub database 了 , After the optimization of horizontal segmentation , Can often solve the single inventory reserves and performance bottlenecks . But because the same table is assigned in different databases , Routing work that requires additional data manipulation , Therefore, the system complexity is greatly increased

Horizontal sub table

  • The database can be split horizontally , Is the data sheet ok ? We try to put some PRODUCT_DB( Commodity bank ) Table in , A horizontal split was made :

  • [ Failed to transfer the external chain picture , The origin station may have anti-theft chain mechanism , It is suggested to save the pictures and upload them directly (img-KrQYZVsv-1635769600232)(C:\Users\admin\AppData\Roaming\Typora\typora-user-images\1635769160610.png)]

  • It is similar to the idea of horizontal sub database , But the goal of this split is table , Product information and product description are divided into two sets of tables . If the goods ID It's odd , Map this operation to product information 1 surface ; If the goods ID For the even , Map operations to product information 2 surface , This is called Horizontal sub table . Horizontal sub table It's in the same database , Split the data of the same table into multiple tables according to certain rules . The improvement it brings is :

    • Optimize the performance problems caused by too much data in a single table
    • avoid IO Compete and reduce the chance of locking the watch
  • The level table in the library , It solves the problem of too much data in a single table , The separated tables contain only a part of the data , Thus, the data quantity of a single table is reduced , Improve retrieval performance . But because the data of the same table is split into multiple tables , Additional routing for data operations is also required , Therefore, the system complexity is increased

Summary

  • Vertical sub table : You can set the fields of a wide table according to the access frequency 、 Business coupling is loose and tight 、 The principle of whether a large field is split into multiple tables , This not only makes the business clear , It can also improve some performance . After break up , Try to avoid associated queries from a business perspective , Otherwise, the performance will not be worth it
  • Vertical sub database : Multiple tables can be classified by business coupling , Store them in different warehouses , These libraries can be distributed on different servers , Thus, the access pressure is loaded by multiple servers , Greatly improve performance , At the same time, it can improve the business clarity of the overall architecture , Different business libraries can customize the optimization scheme according to their own situation . But it needs to solve all the complex problems brought about by cross Library
  • Horizontal sub database : You can put the data of a table ( Press the data line ) It's divided into different warehouses , Each database has only part of the data of this table , These libraries can be distributed on different servers , Thus, the access pressure is loaded by multiple servers , Greatly improve performance . It not only needs to solve all the complex problems caused by cross Library , We also need to solve the problem of data routing
  • Horizontal sub table : You can put the data of a table ( Press the data line ) It is divided into multiple tables in the same database , Each table has only part of the data of this table , This can slightly improve performance , It is only used as a supplementary optimization of the horizontal sub database

Generally speaking , In the system design stage, we should determine the vertical sub base according to the business coupling tightness , Vertical table scheme , When the amount of data and access pressure is not particularly large , Consider caching first 、 Read / write separation 、 Index technology, etc . If the amount of data is huge , And continue to grow , Then consider the scheme of horizontal database and table

Problems caused by sub database and sub table

  • Sub database and sub table effectively alleviate the problem of big data 、 Performance and pressure caused by high concurrency , Can also break through the network IO、 Hardware resources 、 The bottleneck in the number of connections , But it also brings some problems

Transaction consistency issues

  • Because sub database and sub table distribute data in different databases or even different servers , It will inevitably bring Distributed transactions problem , We need additional programming to solve this problem

Cross node ****join

  • Before dividing the warehouse and table , We can search the products through the following SQL Query the store information :

  • SELECT p.*,s.[ Shop name ],s.[ credibility ] FROM [ Commodity information ] p LEFT JOIN [ Store information ] s ON p.id = s.[ The store it belongs to ] WHERE...ORDER BY...LIMIT...
    
  • But after dividing the database and table ,**[ Commodity information ] and [ Store information ]** Not in a database or a table , Not even on a server , Unable to get sql Statement for association query , We need additional programming to solve this problem

Paging across nodes 、 Sorting and aggregation functions

  • When querying across nodes and multiple databases ,limit Pagination 、order by Sorting and aggregation functions , It becomes more complicated . You need to sort the data in different sharding nodes and return , Then the result sets returned by different segments are summarized and re sorted . for example , The commodity warehouse after horizontal warehouse division , Press ID Reverse sort paging , Take the first page
版权声明:本文为[It Maple fighter]所创,转载请带上原文链接,感谢。 https://javamana.com/2021/12/202112122326534522.html