Experience sharing: shortcomings and pitfalls of Apache Kafka - Emil Koutanov

Jiedao jdon 2021-05-04 12:36:18

I've helped several large customers use Kafka as a messaging backbone for microservice-style architectures, and I have a good understanding of its capabilities and of the use cases where it really works. But I'm certainly no Kafka apologist. Any technology that has been adopted this rapidly is bound to polarize its audience and rub some developers the wrong way, and Kafka is no exception. Like anything else, it takes considerable time to fully understand Kafka and event streaming before you can use them proficiently. Expect some frustration along the way.

I've compiled a list of shortcomings that tend to frustrate developers or catch out unsuspecting beginners. They are in no particular order:

Too many adjustable parameters

The number of configuration parameters in Kafka can be overwhelming, not just for beginners but for seasoned professionals too. With the possible exception of the JVM, I can't think of another piece of technology with so many configuration parameters.

This is not to say that the configuration options are unnecessary. But one does wonder how many of these parameters could be replaced by ergonomics, as Java did with G1. Instead of specifying numerous individual thresholds and tolerances, operators would set performance targets and let the system derive a set of values that meets them.

Unsafe default values  

This is my biggest complaint about the configuration options. Kafka's authors make several bold claims about its ordering and the strength of its message delivery guarantees. You would be forgiven for assuming the defaults are sensible, since defaults ought to favor safety over other competing qualities.

Kafka's defaults tend to be optimized for performance, and where safety is critical you need to override them explicitly on the client. (Performance and safety pull in opposite directions; Kafka's defaults favor performance, so if safety matters more to you than performance, you cannot rely on them.)

Fortunately, setting the properties for safety has only a minor impact on performance; Kafka is still a beast. Remember the first rule of optimization: don't do it. Kafka would have been better had its creators given this more thought.

Some concrete examples :

  • enable.auto.commit: The default is true, which makes the consumer commit offsets every 5 seconds (set by auto.commit.interval.ms), whether or not it has finished processing the records. This is usually not what you want, because it leads to mixed message delivery semantics: if the consumer fails, some records may be delivered twice while others may not be delivered at all. The default should have been false, leaving the client application to decide when to commit.

  • max.in.flight.requests.per.connection: The default is 5, which can result in messages being published out of order if one (or more) of the queued messages times out and is retried. The default should have been 1.
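The two overrides above can be sketched as plain client properties. This is only an illustration: the property keys are the ones named in the list, while the class and method names are hypothetical, and a real application would merge these into the full configuration it passes to its consumer and producer.

```java
import java.util.Properties;

// Sketch only: builds client settings that favor safety over the defaults.
// The property keys are the Kafka client settings named in the list above;
// the class and method names are hypothetical.
class SafeKafkaDefaults {

    // Consumer settings: turn off periodic offset auto-commit so the
    // application decides when a record counts as processed.
    static Properties consumerOverrides() {
        Properties props = new Properties();
        props.setProperty("enable.auto.commit", "false");
        return props;
    }

    // Producer settings: allow a single unacknowledged request per
    // connection so a retried batch cannot overtake a later one.
    static Properties producerOverrides() {
        Properties props = new Properties();
        props.setProperty("max.in.flight.requests.per.connection", "1");
        return props;
    }

    public static void main(String[] args) {
        System.out.println("consumer: " + consumerOverrides());
        System.out.println("producer: " + producerOverrides());
    }
}
```

With auto-commit switched off, the application has to commit offsets itself once it has finished processing, for instance by calling commitSync() or commitAsync() on the consumer.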

Appalling tools

The command-line parameters are inconsistently named, and a simple operation such as publishing a keyed message requires you to jump through hoops by passing obscure, undocumented properties. Some native features, such as record headers, are not supported at all. The usability of the built-in tools is a well-known pain point in the Kafka community.

It's a real shame. It's like buying a Ferrari, only to have it delivered with plastic hub caps. Most Kafka practitioners have long since abandoned the out-of-the-box CLI utilities in favor of other open source tools (for example, Kafdrop and Kafkacat) and third-party commercial products (for example, Kafka Tool).

Complicated bootstrapping process

The bootstrapping and service discovery process that clients use to establish broker connections is complicated and easily confuses users. The client is initially given a list of broker addresses and ports. It then connects directly to one of those addresses, discovers the remaining broker nodes, and establishes new connections directly with the discovered nodes.

In a simple, homogeneous network setup, where all connections from all clients and peers go through a single ingress, this is straightforward. In a heterogeneous network there may be multiple ingress points, which separate broker-to-broker traffic, internal clients on the same local network, and external clients that may connect over the Internet.

The bootstrap and discovery process then requires special configuration: you need dedicated listeners and a set of individually advertised listeners, which are the addresses presented to connecting clients.
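To make the role of advertised listeners more concrete, here is a toy model of the discovery step. It is not Kafka client or broker code. It only illustrates the idea that a client bootstrapping through a given listener is handed each broker's advertised address for that same listener. The host names, ports and the listener names INTERNAL and EXTERNAL are made up for the example; on a real broker the corresponding settings are listeners and advertised.listeners.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of bootstrap and discovery; not the Kafka client or broker code.
// All host names, ports and listener names below are made up.
class ListenerDiscoverySketch {

    // One broker: maps a listener name to the address it advertises on it.
    static final class Broker {
        final Map<String, String> advertised = new LinkedHashMap<>();

        Broker advertise(String listenerName, String address) {
            advertised.put(listenerName, address);
            return this;
        }
    }

    // What a client learns after bootstrapping through the given listener:
    // each broker's advertised address for that same listener.
    static List<String> discover(List<Broker> cluster, String listenerName) {
        List<String> addresses = new ArrayList<>();
        for (Broker broker : cluster) {
            String address = broker.advertised.get(listenerName);
            if (address != null) {
                addresses.add(address);
            }
        }
        return addresses;
    }

    // A hypothetical two-broker cluster with separate internal and
    // external entry points.
    static List<Broker> sampleCluster() {
        List<Broker> cluster = new ArrayList<>();
        cluster.add(new Broker()
                .advertise("INTERNAL", "broker-0.local:9092")
                .advertise("EXTERNAL", "kafka.example.com:19092"));
        cluster.add(new Broker()
                .advertise("INTERNAL", "broker-1.local:9092")
                .advertise("EXTERNAL", "kafka.example.com:19093"));
        return cluster;
    }

    public static void main(String[] args) {
        List<Broker> cluster = sampleCluster();
        System.out.println("internal client sees: " + discover(cluster, "INTERNAL"));
        System.out.println("external client sees: " + discover(cluster, "EXTERNAL"));
    }
}
```

If the external listener advertised the internal host names instead, an external client would bootstrap successfully and then fail on every follow-up connection, which is the kind of confusion this section describes.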

Shaky client libraries

The quality and maturity of client libraries written in languages other than Java, Python, .NET and C fall short. If you are a Java developer, you're set, as that is where most of the development effort is concentrated. Golang and other communities, however, have been struggling to get a stable library. Although some of these "independent" libraries have been around for years, the number and severity of the bugs I have run into in these languages are genuinely concerning.

Lack of true multi-tenancy

Kafka's apologists claim that it supports multi-tenancy. Its design is limited to using access control lists (ACLs) to isolate topics and to enforce quotas, which gives clients the illusion of isolation without creating any isolation in the management plane. That is like saying your fridge supports multi-tenancy because it lets you store food on different shelves.

A true multi-tenant solution would provide multiple logical clusters within a larger physical cluster. These logical clusters could be managed separately; for example, a misconfigured ACL in one logical cluster would have no effect on the others.

Lack of geo-awareness

Geo-replication is not built into the broker, and it is generally accepted that high-performance Kafka clusters and "stretch" topologies don't mix. There is an open source project, MirrorMaker, which is essentially a pipe that pumps records from one cluster to another without preserving key metadata (such as offsets).

Confluent has its own proprietary tool, Replicator, which does preserve metadata, but it is a licensed part of the Confluent Enterprise suite.

All in all, despite the above, I'm not saying Kafka is rubbish; quite the contrary. Certainly, Kafka is not without its flaws. The tooling is substandard, to put it mildly. The breadth of Kafka's configuration options is overwhelming, and the defaults are riddled with pitfalls that can catch out unsuspecting beginners at any time.

As an event streaming platform, however, Kafka has changed the way we architect and build complex systems. It gives us options, and that is a good thing. Its benefits outweigh its shortcomings and make worthwhile all the trouble that is bound to come with any technology adopted so aggressively.

