This method may carry risks; please read the whole article before operating.
The commands in this article can be executed while Kafka is not running.
After the Kafka server had been running for a few months, its storage space began to run out. Analyzing what was occupying the space showed that the automatically generated "__consumer_offsets" topic took up a large amount of it. This topic records the consumption offset of every user topic. Its cleanup rules differ from those of ordinary topics, and in some situations it may never be cleaned up at all, eventually exhausting the server's disk.
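Before changing anything, it helps to confirm which partitions are actually eating the disk. A minimal sketch of the check: on a real broker you would point `LOG_DIR` at the `log.dirs` path from `server.properties` (e.g. `/var/kafka-logs`); here a scratch directory with fake partition folders is fabricated so the commands can run anywhere.

```shell
# Fabricate a scratch log dir with two __consumer_offsets partition folders
# so the du pattern below can be demonstrated anywhere. On a real broker,
# set LOG_DIR to the log.dirs value from server.properties instead.
LOG_DIR=$(mktemp -d)
mkdir -p "$LOG_DIR/__consumer_offsets-0" "$LOG_DIR/__consumer_offsets-1"
dd if=/dev/zero of="$LOG_DIR/__consumer_offsets-0/00000000000000000000.log" bs=1024 count=512 2>/dev/null
dd if=/dev/zero of="$LOG_DIR/__consumer_offsets-1/00000000000000000000.log" bs=1024 count=256 2>/dev/null

# Size of each partition directory in KiB, smallest first.
du -sk "$LOG_DIR"/__consumer_offsets-* | sort -n
```

On an affected broker, the `__consumer_offsets-*` directories will dominate this listing.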
Viewing the cleanup policy
Because this is an older version of Kafka, the --zookeeper parameter is used here instead of --bootstrap-server.
Use the following command to view the cleanup policy:
./kafka-configs.sh --zookeeper zk01:2181,zk02:2181,zk03:2181/kafka --entity-type topics --entity-name __consumer_offsets --describe
On this server, the following result was obtained:
Configs for topics:__consumer_offsets are segment.bytes=104857600,cleanup.policy=compact,compression.type=uncompressed
In plain terms: each log segment file is 100 MB, the cleanup policy is compact, and the compression type is uncompressed.
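The segment.bytes value from the describe output can be sanity-checked directly:

```shell
# segment.bytes=104857600 bytes → segment size in MiB
seg_mib=$((104857600 / 1024 / 1024))
echo "$seg_mib"   # → 100
```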
A compact-only policy deduplicates records by key but never deletes data by age, so of course the server ran out of space.
First, remove the special cleanup policy of __consumer_offsets:
./kafka-configs.sh --zookeeper zk01:2181,zk02:2181,zk03:2181/kafka --entity-type topics --entity-name __consumer_offsets --alter --delete-config cleanup.policy
Deleting the override is supposed to make this topic fall back to the same cleanup policy as ordinary topics, but just in case, manually add an explicit cleanup policy as well:
./kafka-configs.sh --zookeeper zk01:2181,zk02:2181,zk03:2181/kafka --alter --entity-name __consumer_offsets --entity-type topics --add-config retention.ms=604800000
./kafka-configs.sh --zookeeper zk01:2181,zk02:2181,zk03:2181/kafka --alter --entity-name __consumer_offsets --entity-type topics --add-config cleanup.policy=delete
These two commands adjust the cleanup logic of __consumer_offsets to: keep data for 7 days, and clean up by deleting rather than compacting.
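The retention.ms value set above works out to exactly 7 days:

```shell
# retention.ms=604800000 → ms / (ms per s) / (s per min) / (min per h) / (h per day)
days=$((604800000 / 1000 / 60 / 60 / 24))
echo "$days"   # → 7
```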
After that, start Kafka and you will see large amounts of data being marked for deletion. Wait a while (assuming the retention-check delay parameters have not been changed), and Kafka will automatically delete the data older than 7 days. From then on, server space is no longer a worry.
Although the server space problem is solved, a question remains: will this cause some partitions' offset records to disappear, leading to repeated consumption?
For example: a topic has 200 records, and a consumer consumed 100 of them 8 days ago; in the 8 days since, there has been no consumer and no producer.
If the consumer resumes today, its previous consumption record is more than 7 days old and unchanged, so it has most likely been deleted. Will the consumer start consuming again from message 0?
In the current environment, the data topics are also limited to 7 days of retention, which avoids this problem here, but in other setups it could indeed cause trouble.
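The answer hinges on the consumer's auto.offset.reset setting (a real Kafka consumer configuration): when a group's committed offset no longer exists, the consumer falls back to that policy instead of its saved position. A toy shell sketch of the decision, using the 200-record example above:

```shell
# Toy illustration of auto.offset.reset (a real Kafka consumer setting).
# When the committed offset for a consumer group has expired and been
# deleted, the consumer falls back to this policy, not its saved position.
auto_offset_reset="earliest"   # the other common value is "latest"
committed_offset=""            # empty: the offset record was deleted

if [ -n "$committed_offset" ]; then
  decision="resume from committed offset $committed_offset"
elif [ "$auto_offset_reset" = "earliest" ]; then
  decision="resume from offset 0: the first 100 messages are consumed again"
else
  decision="resume from the log end: messages 100-199 are never consumed"
fi
echo "$decision"
```

With earliest the group re-reads the first 100 messages (duplicate consumption); with latest it skips straight past messages 100-199 (data loss). Neither is free, which is why the offsets topic's retention should be at least as long as the data topics' retention.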