
WIP draft of generate() outside of chat.answer() #432

Open
dillonroach wants to merge 19 commits into Quansight:main from dillonroach:dpr-add-llms

Conversation

@dillonroach

Takes the format from chat.answer and duplicates it, without sources, as chat.generate, which is then implemented in Assistant as well. I got a bit turned around with the async aspect in the stream = False case; there is room for cleanup. The exl2 assistant currently instantiates a number of variables that it should likely take from Chat instead (e.g. stream: bool).

@dillonroach dillonroach requested a review from pmeier June 13, 2024 08:50
@dillonroach dillonroach mentioned this pull request Jun 13, 2024
@dillonroach (Author)

This PR needs to coordinate with #438 - wherever the final pattern lands there, I need to integrate the generate() and answer() (_stream()) ingest points. This also depends on how we expect to pass the prompt str or messages, etc., through the pre-process stage. I will make this a point of conversation in our Monday meeting.

@dillonroach (Author)

@pmeier - I'd already built out most of the assistant examples, so I just did them all in this pass, but we don't have to discuss them all for the first round; let's just pick _anthropic and _openai and see how those are looking specifically.

In the interest of helping this merge quickly, I'll also pull exl2 out into another PR that is just about integrating it as an assistant on its own.

Comment thread ragna/assistants/_openai.py Outdated
Comment thread ragna/assistants/_anthropic.py Outdated
Comment thread ragna/assistants/_anthropic.py Outdated
@dillonroach (Author)

@pmeier - at this point it seems to make more sense to rebase the changes onto the latest corpus-dev, unless that causes trouble somewhere. If that works, I'll open that PR and then close this one with a pointer to the new version.

@pmeier (Member) left a comment


@dillonroach I pushed 8e120c6. The two main changes I made are:

  1. Fixed the return types of .generate() for all assistants. It now returns an async iterator over the full data the API returns. Since this function is intended as a general building block for users, we shouldn't assume they only want the text.

     However, we can make this assumption for .answer(). Meaning, this function calls .generate() and extracts the text.

  2. The _render_prompt and corresponding functions treated incoming messages as dictionaries. ragna.core.Message is a proper object and thus key access will not work. Furthermore, message.role is the ragna.core.MessageRole enum and cannot be used directly. Instead, you need to use message.role.value if you want the string.
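The Message handling fix in point 2 can be sketched roughly as follows. The Message and MessageRole definitions below are simplified stand-ins for the ragna.core classes, not the actual implementation:

```python
import enum
from dataclasses import dataclass

# Simplified stand-ins for ragna.core.MessageRole and ragna.core.Message;
# the real classes carry more fields and behavior.
class MessageRole(enum.Enum):
    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"

@dataclass
class Message:
    content: str
    role: MessageRole

def render_prompt(messages: list[Message]) -> list[dict[str, str]]:
    # Attribute access instead of key access, and .value to get the
    # plain string out of the MessageRole enum.
    return [
        {"role": message.role.value, "content": message.content}
        for message in messages
    ]

rendered = render_prompt(
    [
        Message(content="Hello", role=MessageRole.USER),
        Message(content="Hi there", role=MessageRole.ASSISTANT),
    ]
)
```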

With my commit, CI should be green. Note that I base my review on the latest commit, i.e. including my cleanup. Thus, please make sure to review what I have pushed, because otherwise the comments might make no sense.
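The generate()/answer() split described in point 1 can be sketched like this; EchoAssistant and its chunk shape are hypothetical stand-ins for illustration, not the actual ragna API:

```python
import asyncio
from typing import Any, AsyncIterator

# Hypothetical minimal assistant: generate() yields the raw chunks the API
# returns (with metadata), while answer() extracts just the text.
class EchoAssistant:
    async def generate(self, prompt: str) -> AsyncIterator[dict[str, Any]]:
        # Stand-in for streaming API responses; each chunk carries
        # metadata alongside the text delta.
        for word in prompt.split():
            yield {"delta": {"text": word + " "}, "model": "echo-1"}

    async def answer(self, prompt: str) -> AsyncIterator[str]:
        # answer() assumes callers only want the text, so it pulls the
        # text field out of each chunk produced by generate().
        async for chunk in self.generate(prompt):
            yield chunk["delta"]["text"]

async def main() -> str:
    assistant = EchoAssistant()
    return "".join([text async for text in assistant.answer("hello world")])

result = asyncio.run(main())
```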

Comment thread ragna/assistants/_ai21labs.py Outdated
Comment thread ragna/assistants/_anthropic.py
"preamble_override": self._make_preamble(),
"message": prompt,
"preamble_override": system_prompt,
"message": self._render_prompt(prompt),
@pmeier (Member)


While the message here can indeed only be a single string, the endpoint has a chat_history parameter, and that one takes the previous messages similarly to all other assistants.

I would let _render_prompt return a tuple of the chat history and the current user message, e.g.

chat_history, message = self._render_prompt(prompt)
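A minimal sketch of that shape, assuming the incoming prompt is a list of (role, content) pairs; the real ragna signature and the endpoint's exact key names may differ:

```python
def render_prompt(
    messages: list[tuple[str, str]],
) -> tuple[list[dict[str, str]], str]:
    """Split messages into the prior chat history and the current user message.

    Hypothetical helper: the history entries use role/message keys for the
    endpoint's chat_history parameter, and the final user message is
    returned separately as a plain string.
    """
    *history, (last_role, last_content) = messages
    chat_history = [
        {"role": role, "message": content} for role, content in history
    ]
    return chat_history, last_content

chat_history, message = render_prompt(
    [
        ("USER", "What is Ragna?"),
        ("CHATBOT", "A RAG framework."),
        ("USER", "Thanks!"),
    ]
)
```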

@dillonroach (Author)


Conflicted on this one - it seems like this would fit specifically with the pre-process pass, or else it puts this one specific assistant ahead of the others in terms of capabilities. It certainly doesn't hurt anything, so I'm happy to do it here, but I also see how we might want to work out what a pre-process stage looks like for all assistants and implement it in one go.

Comment thread ragna/assistants/_google.py
Comment thread tests/assistants/test_api.py
