Added gRPC interface definition for data cache #46
base: main
Changes from all commits
79d3db3
87b194e
441cc13
7145c9e
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,75 @@ | ||
| syntax = "proto3"; | ||
|
|
||
| package services.data_cache; | ||
|
|
||
| service DataCacheService { | ||
| // Create a topic with partitions and replication factor | ||
| rpc CreateMQ(CreateMQRequest) returns (CreateMQResponse); | ||
|
|
||
| // Create many topics with identical partitions and replication factor | ||
| rpc CreateNMQs(CreateMQsRequest) returns (CreateMQResponse); | ||
|
|
||
| // Send a single message to a topic | ||
| rpc PushMQ(PushMQRequest) returns (ProducersResponse); | ||
|
|
||
| // Send a single message to many topics | ||
| rpc PushMQs(PushMQsRequest) returns (ProducersResponse); | ||
|
|
||
| // Get message from a topic | ||
| rpc PullMQ(PullMQRequest) returns (ConsumersResponse); | ||
|
|
||
| // Get messages from many topics | ||
| rpc PullMQs(PullMQsRequest) returns (ConsumersResponse); | ||
| } | ||
|
|
||
| message CreateMQRequest { | ||
| string mq_name = 1; // topic name | ||
| int32 partitions = 2; // Each topic can be divided into multiple partitions | ||
| int32 replication_factor = 3; // replication for high availability | ||
| } | ||
|
|
||
| message CreateMQsRequest { | ||
| string mq_names = 1; // topic names separated by ',', e.g. topic1, topic2, topic3 | ||
| int32 partitions = 2; // Each topic can be divided into multiple partitions | ||
| int32 replication_factor = 3; // replication for high availability | ||
| } | ||
|
|
||
| message CreateMQResponse { | ||
| bool success = 1; // success or failure | ||
| string message = 2; // response (error message if failure) | ||
| } | ||
|
|
||
| message PushMQRequest { | ||
| string mq_name = 1; // topic name | ||
| string key = 2; // optional | ||
| bytes value = 3; // message | ||
| } | ||
|
|
||
| message PushMQsRequest { | ||
| string mq_names = 1; // list of topics | ||
| string key = 2; // optional | ||
| bytes value = 3; // message | ||
| } | ||
|
|
||
| message ProducersResponse { | ||
| bool success = 1; // success or failure | ||
| string messages_with_offsets = 2; // response offset / error message if failure | ||
| } | ||
|
|
||
| message PullMQRequest { | ||
| string mq_name = 1; // topic name | ||
| int32 timeout_seconds = 2; // optional, stop after secs (0 = no timeout) | ||
| } | ||
|
|
||
| message PullMQsRequest { | ||
| string mq_names = 1; // list of topics | ||
| int32 timeout_seconds = 2; // optional, stop after secs (0 = no timeout) | ||
| } | ||
|
Comment on lines +59 to +67
Contributor
Might be nice to have an option to specify a starting offset. If the connection to the gRPC server drops, I don't want to get all the messages again when I reconnect.
Contributor
Thinking about this more, I'm seeing that the If the above is true, may I suggest a much clearer option? Use this structure instead:
Contributor
But now I have a further question: are clients expected to keep track of the partitions that messages come from? If the network drops my connection to the server, am I going to have to specify the latest offset for all the partitions I know about, if I want to avoid pulling down messages I've already seen? And is this exposing too much of Kafka's guts to the clients? As a client, I'd like to see topics as a singular stream, and not need to care about partitions.
Author
Actually, I want to keep it generalized so that if Kafka is ever replaced, we don't need to change PullMQRequest again. However, I like the idea of an offset: regardless of the MQ (Redis, Kafka, or RabbitMQ), an offset can be generalized and used to fetch a message from the stream. Let me update the code.
Contributor
So, this gets to the heart of the problem. I think we're hiding Kafka at too high of a level. By hiding it inside a singleton service like this, we have N clients using 1 Kafka connection, with new clients joining the stream at arbitrary times. The service now needs to track which client has seen what messages, or the client has to track offsets (and then the service will need to map the client's offset to the order in which messages came back from the different partitions; partition 0 and partition 1 can both have a message at offset 5, but they'll be different messages). This is a boatload of complexity being distributed into the control system, just to avoid using Kafka's out-of-the-box, performant, enterprise-scale solution (groups)! If, instead, we hide Kafka inside a library that each client depends on, and each client is therefore responsible for its own connection to Kafka, then we can use Kafka's group mechanism to let it handle reconnecting from the correct position for us. The business logic of the clients still won't know they're talking to Kafka, as that detail is hidden in the library. We can now have our cake and eat it too. Yes, this does add a little more cost to converting to a different platform than Kafka, but we'd just update the details in the library and push it to all the dependent applications. Automated tools like Dependabot will make this very easy.
Author
Let me be clear once again: we are not going to support native calls to any central service (including Kafka). |
||
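The offset concern raised in the thread above can be made concrete with a small sketch. It is not Kafka client code: `ToyTopic`, `pull`, and `advance` are hypothetical names for an in-memory stand-in, used only to show that an offset is meaningful per partition and that a client resuming after a disconnect needs one resume position per partition.

```python
class ToyTopic:
    """Hypothetical in-memory stand-in for a partitioned topic.

    Each partition is a list; the list index plays the role of the offset.
    """

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def push(self, partition, value):
        """Append a message and return its offset within that partition."""
        log = self.partitions[partition]
        log.append(value)
        return len(log) - 1

    def pull(self, positions):
        """Return (partition, offset, value) for every message at or past
        the caller's resume position for each partition."""
        records = []
        for partition, log in enumerate(self.partitions):
            start = positions.get(partition, 0)
            for offset in range(start, len(log)):
                records.append((partition, offset, log[offset]))
        return records


def advance(positions, records):
    """Store, per partition, the next offset the client should read."""
    for partition, offset, _value in records:
        positions[partition] = max(positions.get(partition, 0), offset + 1)
    return positions


topic = ToyTopic(num_partitions=2)
for i in range(6):
    topic.push(0, f"p0-msg{i}")
    topic.push(1, f"p1-msg{i}")

# Offset 5 exists in both partitions and names two different messages,
# so a single bare offset cannot identify a position in the topic.
assert topic.partitions[0][5] != topic.partitions[1][5]

positions = {}                    # client-side state: partition -> next offset
first = topic.pull(positions)
advance(positions, first)         # the client has now seen everything so far

topic.push(1, "p1-msg6")          # arrives while the client is disconnected
after_reconnect = topic.pull(positions)
print(after_reconnect)            # [(1, 6, 'p1-msg6')]
```

In this toy model the `positions` dictionary is exactly the bookkeeping the reviewer is asking about: someone, either the client or the service, has to hold it for every partition.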
|
|
||
| message ConsumersResponse { | ||
| string keys = 1; // associated key or keys | ||
|
Contributor
Use |
||
| bytes values = 2; // response | ||
|
Contributor
Use |
||
| string offsets = 3; // offset or many offsets | ||
|
Contributor
|
||
| string partitions = 4; // partition or many partitions | ||
|
Contributor
Use |
||
| } | ||
|
|
||
Use `repeated` here, so clients can provide a list of names.
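A short sketch of why a list is easier to consume than a comma-separated string. The helper `parse_mq_names` is hypothetical and only illustrates the parsing a server would otherwise need; the input string is the example given in the `mq_names` field comment in the diff. With a `repeated string` field, the Python generated code already exposes a list-like sequence, so no such parsing step is needed.

```python
from typing import List


def parse_mq_names(mq_names: str) -> List[str]:
    """Hypothetical parsing needed when names arrive as one string."""
    return [name.strip() for name in mq_names.split(",") if name.strip()]


# A naive split keeps the spaces from the documented example, so the
# second and third names would not match the intended topics.
naive = "topic1, topic2, topic3".split(",")
print(naive)                                      # ['topic1', ' topic2', ' topic3']

# Every consumer of the field has to remember to normalise it.
print(parse_mq_names("topic1, topic2, topic3"))   # ['topic1', 'topic2', 'topic3']
```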