!1643 [Feature] Support D-side Pre-allocation First Scheduling in LLMDatadist and OmniCache, inherit from connector_v1
- fix a bug
- add missing function
- fix conflicts with latest datadist connector
- Merge remote-tracking branch ‘origin/master’ into master_d2p_latest
- fix conflicts with latest datadist connector
- fix bug
- update omni/accelerators/pd/llmdatadist_connector_d2p.py.
- fix bugs
- polish the code of llmdatadist_connector_d2p.py
- polish the code of omni_cache_connector_d2p.py
- polish the code of omni_cache_connector_d2p.py, remove the classes tha…
- Merge branch ‘msater_d2p_latest’ of https://gitee.com/yyaoaj/omniinfer…
- polish the code of omni_cache_connector_d2p.py, remove the classes tha…
- polish the code of llmdatadist_connector_d2p.py, remove the classes t…
- fix a bug in llmdatadist_connector_v2.py and rename the connectors for d2p
- revise omni_proxy based on comments for d2p
- add D->P llmdatadist_connector_v2
- enable omni_proxy to support d2p
- support d2p schedule in omni_cache connector
Omni-Infer: Ascend-Based Inference Acceleration for Ultra-Large-Scale MoE Models
Community News (for more events, see the community events calendar) 🔥
Past Events
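Omni-Infer is a suite of inference acceleration tools built specifically for the Ascend hardware platform. It is fully compatible with today's mainstream open-source large-model inference frameworks (such as vLLM) and aims to deliver high-performance, enterprise-grade inference with native support and a continuously expanding feature set.
Core Features
Open Source Community
For information about how the Omni-Infer community operates, its activities, and its governance, please visit our community operations repository.
High-Level Architecture Diagram
Quick Start
For a quick-deployment example of PD (prefill/decode) separation, see the guide. To integrate Omni_Infer into your project, see the installation guide and the documentation for detailed setup instructions and the API reference.
Contributing
We welcome your code contributions to Omni_Infer! Please read the contributing guide, and submit pull requests or issues through Gitee Issues.
License
Omni_Infer is released under the MIT License.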