TypeError: TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]]

Thank you for your contribution. I encountered the following error when training with toy data:

TypeError: TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]]

From what I read online, the possible causes are:
1. The tokenizer's maximum length is not set;
2. There are blank lines in the jsonl file;
3. A newer version of the transformers library is incompatible;
4. There are NaN values in the data.
I tried the fix for each of these four causes, but the error is still raised. What else could cause it? Thank you very much!
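
A check that may help narrow this down: this TypeError generally means the tokenizer was handed something other than a string (or a pair of strings), for example `None`, a NaN float, or a number. The sketch below scans the jsonl lines and reports every record whose text field is not a plain string. The function name `find_bad_records` and the field name `"text"` are assumptions for illustration; substitute whatever key the training script actually reads.

```python
import json
import math


def find_bad_records(lines, text_key="text"):
    """Return (line_number, reason) for each line a tokenizer would likely reject.

    `text_key` is a hypothetical field name; use the one your dataset uses.
    """
    problems = []
    for lineno, raw in enumerate(lines, start=1):
        # Blank lines cannot be parsed into a record at all.
        if not raw.strip():
            problems.append((lineno, "blank line"))
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((lineno, "record is not a JSON object"))
            continue

        value = record.get(text_key)
        if isinstance(value, str):
            continue  # This is the only type the tokenizer accepts as text.
        if value is None:
            reason = "missing or null"
        elif isinstance(value, float) and math.isnan(value):
            reason = "NaN"
        else:
            reason = f"not a string: {type(value).__name__}"
        problems.append((lineno, reason))
    return problems


if __name__ == "__main__":
    sample = [
        '{"text": "hello"}',
        "",
        '{"text": null}',
        '{"text": NaN}',
        '{"text": 3}',
    ]
    for lineno, reason in find_bad_records(sample):
        print(f"line {lineno}: {reason}")
```

To run it on a real file, pass the open file handle, e.g. `find_bad_records(open("train.jsonl", encoding="utf-8"))` (the filename is a placeholder). If the list comes back empty, the data file is probably not the culprit and the non-string value is more likely introduced later, in preprocessing or in the collate step.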
