Skip to content

ARTEMIS-5573 and ARTEMIS-5975 Improve AMQP Size estimation#6323

Draft
clebertsuconic wants to merge 4 commits intoapache:mainfrom
clebertsuconic:message-size-amqp
Draft

ARTEMIS-5573 and ARTEMIS-5975 Improve AMQP Size estimation#6323
clebertsuconic wants to merge 4 commits intoapache:mainfrom
clebertsuconic:message-size-amqp

Conversation

@clebertsuconic
Copy link
Copy Markdown
Contributor

No description provided.

@clebertsuconic clebertsuconic force-pushed the message-size-amqp branch 8 times, most recently from f70cda5 to be90ac4 Compare March 29, 2026 21:44
@clebertsuconic clebertsuconic changed the title ARTEMIS-TBD Improving memory estimates on AMQP ARTEMIS-5573 and ARTEMIS-5975 Improve AMQP Size estimation and make it static Mar 29, 2026
@clebertsuconic clebertsuconic force-pushed the message-size-amqp branch 9 times, most recently from 0611d5a to 03b7176 Compare March 30, 2026 22:51
@clebertsuconic clebertsuconic marked this pull request as ready for review March 30, 2026 22:51
@clebertsuconic clebertsuconic force-pushed the message-size-amqp branch 8 times, most recently from 94efe35 to 70beb3a Compare April 1, 2026 13:07
@clebertsuconic clebertsuconic changed the title ARTEMIS-5573 and ARTEMIS-5975 Improve AMQP Size estimation and make it static ARTEMIS-5573 and ARTEMIS-5975 Improve AMQP Size estimation Apr 1, 2026
@tabish121 tabish121 self-requested a review April 1, 2026 14:14
Copy link
Copy Markdown
Contributor

@tabish121 tabish121 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, nice simplification of some of the message handling

Also making it immutable to avoid races after updates like we had in the
past


// this is "borrowed" from:
// https://github.com/apache/qpid-proton-j/blob/6dc5587f1d1b23969a8994f1755198e638e92bc4/proton-j/src/main/java/org/apache/qpid/proton/codec/messaging/FastPathApplicationPropertiesType.java#L93-L115
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was inspired by the proton-j version but isn't a copy and it omits some of the actual validation checks that are done in the proton-j version which are not as thorough as they likely should be but are still better than not validating the encoding at all.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tabish121 I don't understand what you're asking me to do here? update the comment?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tabish121 this is being called right after the readConstructor was read. the only thing that would be decoded would be either this, or the skipValue() which is doing pretty much the same as this code is doing now.

I'm confused on what are you asking

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm saying this does no validation on the data it reads, such as checking that the size value is smaller than the data remaining or on enforcing that the count is actually divisible by two which could indicate that the payload is corrupt if it isn't. I know proton-j does check at least the size data or maybe it checks that count isn't large than the data remaining in the buffer.

My point is that some validation might be sensible here to ensure valid data is being used up front.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is doing the same validation thst was being done before.

If you follow the previous call all cobstructor.skipvalue does is read the value and move the bytes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add the validation though, but this is the same semantic as before.


import static org.junit.jupiter.api.Assertions.assertEquals;

public class AMQPDecodeApplicationPropertiesTest {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a good place to add a test to validate that an AMQP message with no application properties actually has the correct values for the new size and count value tracking.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tabish121 @gemmellr Do you guys know a way to write a NULL encoding, to validate this portion, without having me to deal with bytes directly? I was looking to add one on AMQPDecodeApplicationPropertiesTest.

https://github.com/apache/qpid-proton-j/blob/6dc5587f1d1b23969a8994f1755198e638e92bc4/proton-j/src/main/java/org/apache/qpid/proton/codec/messaging/FastPathApplicationPropertiesType.java#L111-L112

Copy link
Copy Markdown
Contributor

@tabish121 tabish121 Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what null encoding you want to write. I was referring to a message that does not include the section ApplicationProperties at all, meaning it isn't encoded just a message like, Header, MessageAnnotations, Properties and an AmqpValue section

To write an ApplicationProperties that has no actual map payload that would be something like the following

        buffer.writeByte((byte) 0); // Described Type Indicator
        buffer.writeByte(EncodingCodes.SMALLULONG);
        buffer.writeByte(ApplicationProperties.DESCRIPTOR_CODE.byteValue());
        buffer.writeByte(EncodingCodes.NULL);

Proton MessageImpl encoding with an ApplicationProperties object assigned but that has not had a Map added via setValue would probably result in this encoding.

Copy link
Copy Markdown
Contributor Author

@clebertsuconic clebertsuconic Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the codec, apparently support a value type that is empty. basically you can add EncodingCodes.NULL on the encoding and you get an empty ApplicationProperties.

I don't find it very useful but I see it on the codec.

As for the empty application properties I added a test.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do have code somewhere in the tests that can encode messages with the structure you want, I don't recall where off the top of my head but again, it simply would be assigning an instance ApplicationProperties that has no body assigned to its setValue(Map) element and encoding it which should result in an null as the encoding for that part of the described type.

@clebertsuconic clebertsuconic marked this pull request as draft April 7, 2026 21:51
@clebertsuconic
Copy link
Copy Markdown
Contributor Author

I added a commit to be squashed, returning the RELOAD state and persisting the memory estimate on the storage (journal or paging).

I honestly don't think the scan will make any difference.. The real issue is that the estimates were wrong in the first place...

But this commit should settle any discussion.

I think There are already version tests in the Version tests. but i will make sure about it soon.

@clebertsuconic
Copy link
Copy Markdown
Contributor Author

for some reason I'm having to use a different persister, and I would have to digg on previous versions to understand why.

I will do some investigation tomorrow, but most likely I will need the V4 persister.

@clebertsuconic clebertsuconic force-pushed the message-size-amqp branch 7 times, most recently from 09bb86b to 540ad20 Compare April 8, 2026 00:58
@clebertsuconic
Copy link
Copy Markdown
Contributor Author

The reason I had to add a V4 is because of Paging:

On this part here, I encode the number of queues and the queueIDs:

int queueIDsSize = buffer.readInt();
queueIDs = new long[queueIDsSize];
for (int i = 0; i < queueIDsSize; i++) {
queueIDs[i] = buffer.readLong();
}

At the point of the encoder I don't have any reference to the number of bytes used by its own decoder.

I'm adding one on V4 now, if in the future anything else is added, we will stop reading at that marker.

@clebertsuconic clebertsuconic force-pushed the message-size-amqp branch 4 times, most recently from 1cb4403 to fa1128e Compare April 8, 2026 01:26
@clebertsuconic
Copy link
Copy Markdown
Contributor Author

I intend to squash the second commit on the first as soon some review is done on the new persister.

@clebertsuconic
Copy link
Copy Markdown
Contributor Author

I'm setting it as ready to review. but this commit should be squashed before merged.

@clebertsuconic
Copy link
Copy Markdown
Contributor Author

I needed to add versioning to AMQPLargeMessagePersister as well.

I could choose to skip memory calculations on large message... however I will prefer to keep it the same way as StandardMessage.

this is also better for future additions

MemoryEstimate is now persisted with the message so we won't have to
scan it to recalculate it.

If loading a previous version, the server will at that point scan the
message.
assert record != null && AMQPStandardMessage.class.equals(record.getClass());

// This might be useful in the future if we need to add more data to this persister. For now, it's just a filler.
int sizeV4 = buffer.readInt();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what would you want to store in this extra int field that you've written the value if DataConstants.SIZE_INT which is four, so in the future you cannot write anything into there that can be four and know if you are reading current data or the old default. Would zero not be a better default as the absence of something is surely easier to deal with moving forward?

super.encode(buffer, record);

AMQPLargeMessage msgEncode = (AMQPLargeMessage) record;
buffer.writeInt(DataConstants.SIZE_INT);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as below here, writing 4 here means you've excluded the int value of four from having any meaning in this field moving forward

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the number of bytes I'm currently writing in this persister.

I actually should limit
The buffer here. And skip bytes I didn't read.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean on the reading portion.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tabish121 the idea is.. in the future, when I add another field, say, another Iteger, this becomes 8. so I know when to stop.

right now I can't do the magic comparison on buffer.hasReadableBytes() as I don't know how to delimit what belong to the persister.

However the reading portion is wrong as I should also skip for non reading values (think of inverted compatibility, from newer to odler version)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you are doing, the comments did not make that at all clear though so maybe improve them a bit?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tabish121 I added a commit with a code change here

*/
public enum MessageDataScanningStatus {
NOT_SCANNED(0), SCANNED(1);
NOT_SCANNED(0), RELOAD_PERSISTENCE(1), SCANNED(2);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So since you are now storing the memory estimate in the V4 persistent form can you not alter recoverHeaderDataFromEncoding to provide it the version you are recovering from if less than V4 you also scan and load the values from the encoded ApplicationPropreties if present (using the existing parseAndSkip method) and thereby remove the need for the RELOAD_PERSISTENCE state that you wanted to remove. The code would just need to know when to stop scanning which would be either as soon as a Header was read, or under a similar set of constraints as scanMessageData where it stops after having hit the body sections or a Footer

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could add anything to not needed parsing... but I'm confused on what should I write.

You will stil need the code in place here for version compatbility I guess... (or I guess we could just do a full scan if from an older version.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhhh... this is just to recover the priority out of the Header.

Sure I can write the priority as part of the persister.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tabish121 I pushed a commit that completely eliminates the scanning during reload.

I could remove the RELOAD status (still thinking about it.. it's only needed for isPersistent now. I could persist it on the persister as well and just have it NOT SCANNED.

Please check it out..

(these commits will need to be squashed)

One thing I will work on this afternoon is to add a checker for LargeMessages as well on AMQPReloadFromPersistenceTest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants