vllm.model_executor.layers.fused_moe.moe_torch_iterative
fused_moe
fused_moe(
    hidden_states: Tensor,
    w1: Tensor,
    w2: Tensor,
    gating_output: Tensor,
    topk: int,
    global_num_experts: int,
    expert_map: Tensor = None,
    renormalize: bool = False,
) -> Tensor
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `hidden_states` | `Tensor` | `[*, hidden_size]` | *required* |
| `w1` | `Tensor` | `[num_experts, intermediate_size * 2, hidden_size]` | *required* |
| `w2` | `Tensor` | `[num_experts, hidden_size, intermediate_size]` | *required* |
| `gating_output` | `Tensor` | `[*, num_experts]` | *required* |
| `expert_map` | `Tensor` | `[num_experts]` | `None` |
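A minimal usage sketch, assuming the function is importable from the module path above and that the tensor shapes follow the parameter table. The sizes and random weights below are illustrative assumptions, not values from the source.

```python
import torch

from vllm.model_executor.layers.fused_moe.moe_torch_iterative import fused_moe

# Illustrative sizes (assumptions, not prescribed by the API).
num_tokens, hidden_size, intermediate_size = 4, 64, 128
num_experts, topk = 8, 2

hidden_states = torch.randn(num_tokens, hidden_size)
# w1 has a leading dim of intermediate_size * 2, i.e. two stacked
# projections per expert, per the documented shape.
w1 = torch.randn(num_experts, intermediate_size * 2, hidden_size)
w2 = torch.randn(num_experts, hidden_size, intermediate_size)
gating_output = torch.randn(num_tokens, num_experts)

out = fused_moe(
    hidden_states,
    w1,
    w2,
    gating_output,
    topk=topk,
    global_num_experts=num_experts,
    renormalize=True,
)
print(out.shape)  # expected: torch.Size([4, 64]), matching hidden_states
```

Here `expert_map` is left at its default of `None`; per the table it would be a `[num_experts]` tensor when expert placement needs to be remapped.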