Major Features and Improvements
Train/Eval/Predict/Export
- Support gradient clipping for dense parameters in #424
- Refactor AOTInductor export with split model in #394
- Support env configs for reproducibility in #361
- Support ignore restore optimizer option for train_eval in #389
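As a rough illustration of what gradient clipping for dense parameters does, the sketch below rescales a set of gradients so that their global L2 norm does not exceed a threshold. It is written in plain Python and is not TorchEasyRec's implementation; the function name `clip_grad_norm` and the `max_norm` argument are hypothetical, and the release note does not say whether norm-based or value-based clipping is used. In PyTorch, `torch.nn.utils.clip_grad_norm_` performs the norm-based variant on real tensors.

```python
import math


def clip_grad_norm(grads, max_norm):
    """Rescale gradient vectors in place so their global L2 norm is <= max_norm.

    Hypothetical helper for illustration only. Returns the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        for vec in grads:
            for i in range(len(vec)):
                vec[i] *= scale
    return total_norm


# Two toy "dense parameter" gradients with a global norm of 13.
dense_grads = [[3.0, 4.0], [0.0, 12.0]]
norm_before = clip_grad_norm(dense_grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for vec in dense_grads for g in vec))
```

Clipping by the global norm keeps the direction of the update unchanged and only shortens it, which is why it is a common default for stabilizing training.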
Model
- Add PE-LTR model in #381
- Add WuKong model in #372
- Add PEPNet model in #402
- Improve DlrmHSTU model in #352 #395 #393 #359
  - support num_class > 1
  - support descending order sequence
  - support jagged label
  - support time_bucket_increments in PositionEncoder
- DLRM and WuKong models support only one sparse group in #385
Embedding
- Add AdmissionStrategy support for DynamicEmbedding in #362
- Add storage estimate for dynamic embedding kv counter in #391
Feature
- Support sequence cross features in #375
- Support converting compatible feature configs from EasyRec in #392
Dataset
- Add Kafka dataset with checkpoint support in #401 #408 #413
- Add checkpointable Parquet dataset in #410
- Add checkpointable ODPS dataset in #409
- Support input fields str in #412
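To show what "checkpointable" means for a dataset, here is a minimal plain-Python sketch of a reader that can save its read position and resume from it. The class `CheckpointableReader` and its `state_dict` / `load_state_dict` methods are hypothetical names borrowed from the PyTorch convention; the actual Kafka, Parquet, and ODPS datasets may track different state (for example, per-partition offsets) and expose a different interface.

```python
class CheckpointableReader:
    """Toy reader that yields rows and can save/restore its position."""

    def __init__(self, rows):
        self._rows = rows
        self._offset = 0  # index of the next row to yield

    def __iter__(self):
        while self._offset < len(self._rows):
            row = self._rows[self._offset]
            self._offset += 1
            yield row

    def state_dict(self):
        # Everything needed to resume reading: here, just the offset.
        return {"offset": self._offset}

    def load_state_dict(self, state):
        self._offset = state["offset"]


rows = ["r0", "r1", "r2", "r3", "r4"]

# Read two rows, then take a checkpoint.
reader = CheckpointableReader(rows)
it = iter(reader)
consumed = [next(it), next(it)]
ckpt = reader.state_dict()

# A fresh reader restored from the checkpoint continues where the first stopped.
resumed = CheckpointableReader(rows)
resumed.load_state_dict(ckpt)
remaining = list(resumed)
```

Saving the reader state alongside the model checkpoint is what lets a restarted job avoid re-reading or skipping samples.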
Optimizer
- Support initial_accumulator_value for FusedSparseAdagradOptimizer & add additional optimizer configuration options in #382
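The effect of `initial_accumulator_value` can be seen in a scalar sketch of the standard Adagrad update, where the accumulator of squared gradients starts from that value instead of zero. This is a textbook formulation under assumed defaults (`lr`, `eps`, and the function name `adagrad_step` are illustrative); the fused FBGEMM kernel may place `eps` differently or apply further options. `torch.optim.Adagrad` exposes a parameter of the same name.

```python
import math


def adagrad_step(param, grad, accum, lr=0.1, eps=1e-10):
    """One scalar Adagrad update; returns the new parameter and accumulator."""
    accum = accum + grad * grad
    param = param - lr * grad / (math.sqrt(accum) + eps)
    return param, accum


# Same parameter and gradient, different starting accumulators.
p_zero, acc_zero = adagrad_step(1.0, 0.5, accum=0.0)  # accumulator starts at 0
p_init, acc_init = adagrad_step(1.0, 0.5, accum=0.1)  # initial_accumulator_value = 0.1
```

A non-zero starting accumulator makes the first updates smaller, which can damp large early steps on rarely seen embedding rows.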
Upgrade
- Upgrade torchrec to v1.5.0 in #405
Note
For TorchEasyRec 1.1.x, you should use Docker image version 1.1.
- For the GPU version (CUDA 12.9) with TensorRT:
  mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/tzrec-devel:1.1-cu129
  - PyTorch: v2.10
  - CUDA: v12.9
  - FBGEMM: v1.5.0
  - TorchRec: v1.5.0
  - Python: v3.11
- For the GPU version (CUDA 12.6) without TensorRT:
  mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/tzrec-devel:1.1-cu126
  - PyTorch: v2.10
  - CUDA: v12.6
  - FBGEMM: v1.5.0
  - TorchRec: v1.5.0
  - Python: v3.11
- For the CPU version:
  mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/tzrec-devel:1.1-cpu
  - PyTorch: v2.10
  - FBGEMM: v1.5.0
  - TorchRec: v1.5.0
  - Python: v3.11
Bug Fixes and Other Changes
- [bugfix] fix session_status check error of OdpsWriter by @tiankongdeguiji in #351
- [feat] add MTGR style DlrmHSTU config doc by @tiankongdeguiji in #353
- [bugfix] sampler add raise parse kv error by @chengaofei in #355
- [bugfix] make fsspec disable by default by @tiankongdeguiji in #356
- [feat] add faq doc - The "kv" feature key contains ":" character by @yanzhen1233 in #357
- [feat] bump up pyfg to 1.0.0 by @tiankongdeguiji in #363
- [feat] update local tutorial tdm doc with FG_DAG mode by @asdfasdfsdfas in #365
- [bugfix] fix input names of custom sequence feature when use pyfg 1.0.0 by @tiankongdeguiji in #368
- [feat] update dlc tutorial doc with oss mount by @asdfasdfsdfas in #373
- [bugfix] fix dynamicemb is_sparse for custom/lookup/match feature by @tiankongdeguiji in #376
- [bugfix] fix torch.full error of apply_split_helper for uvm embedding kernel by @tiankongdeguiji in #383
- [feat] use cpu nproc_per_node_is_1 by @chengaofei in #384
- [bugfix] use KVCounter initialization fix of dynamicemb by @tiankongdeguiji in #387
- [feat] add warning for default_value in vocab_list or vocab_dict by @tiankongdeguiji in #388
- [bugfix] make sequence related config optional by @tiankongdeguiji in #390
- [bugfix] fix occasional failure in test_add_timestamp_positional_embeddings_triton by @tiankongdeguiji in #396
- [feat] update custom feature doc by @tiankongdeguiji in #400
- [feat] add odps test quota & fix ecs ram role in test by @tiankongdeguiji in #403
- [bugfix] fix ecs ram role timeout in benchmark by @tiankongdeguiji in #411
- [feat] add ai code review by @tiankongdeguiji in #414
- [feat] refine readme and update doc with new features by @tiankongdeguiji in #416
- [bugfix] fix introduction doc & refine kafka dataset doc by @tiankongdeguiji in #417
- [bugfix] fix search tool hang in readthedoc and doc build warnings by @tiankongdeguiji in #418
- [bugfix] fix kafka dataset test with embed schema by @tiankongdeguiji in #419
- [bugfix] correct typo TRAGET_REPEAT_INTERLEAVE_KEY -> TARGET_REPEAT_INTERLEAVE_KEY by @hobostay in #420
- [feat] add ArrowRecordBatch flink udf usage doc by @tiankongdeguiji in #421
- [feat] unify proto configs w/o using colon by @tiankongdeguiji in #423
- [bugfix] fix ecs ram role error log by @tiankongdeguiji in #426
- [bugfix] fix INPUT_TILE_3_ONLINE=1 and add docs by @tiankongdeguiji in #428
New Contributors
- @asdfasdfsdfas made their first contribution in #365
- @hobostay made their first contribution in #420
Full Changelog: v1.0.0...v1.1.0