vllm.model_executor.layers.pooler
PoolingFn
module-attribute
¶
PoolingFn = Callable[
[Union[Tensor, list[Tensor]], PoolingMetadata],
Union[Tensor, list[Tensor]],
]
AllPool
¶
Bases: PoolingMethod
Source code in vllm/model_executor/layers/pooler.py
forward_all
¶
Source code in vllm/model_executor/layers/pooler.py
forward_one
¶
Source code in vllm/model_executor/layers/pooler.py
get_pooling_params
¶
get_pooling_params(
task: PoolingTask,
) -> Optional[PoolingParams]
Source code in vllm/model_executor/layers/pooler.py
BasePoolerActivation
¶
Source code in vllm/model_executor/layers/pooler.py
forward
abstractmethod
¶
Source code in vllm/model_executor/layers/pooler.py
CLSPool
¶
Bases: PoolingMethod
Source code in vllm/model_executor/layers/pooler.py
forward_all
¶
Source code in vllm/model_executor/layers/pooler.py
forward_one
¶
Source code in vllm/model_executor/layers/pooler.py
get_pooling_params
¶
get_pooling_params(
task: PoolingTask,
) -> Optional[PoolingParams]
Source code in vllm/model_executor/layers/pooler.py
ClassifierPooler
¶
Bases: Module
A pooling layer for classification tasks.
This layer does the following:
1. Applies a classification layer to the hidden states.
2. Optionally applies a pooler layer.
3. Applies an activation function to the output. In the case of classification models it is either sigmoid or softmax. In the case of scoring models, the same behavior is configuration dependent, as in the sentence-transformers library.
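The three steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the vLLM implementation: `linear`, `last_token`, and `classifier_pooler_sketch` are hypothetical names, hidden states are plain lists of floats rather than tensors, and last-token selection stands in for whatever pooling function is configured.

```python
import math


def sigmoid(logits):
    # Independent per-label probabilities (multi-label style output).
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]


def softmax(logits):
    # Probabilities that sum to 1 (single-label style output).
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(hidden, weights, bias):
    # Hypothetical classification layer: one logit per row of `weights`.
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(weights, bias)]


def last_token(rows):
    # Assumed pooling choice for this sketch: keep the final token only.
    return rows[-1]


def classifier_pooler_sketch(hidden_states, weights, bias, activation,
                             pool=last_token):
    # Step 1: apply the classification layer to every token's hidden state.
    logits_per_token = [linear(h, weights, bias) for h in hidden_states]
    # Step 2: pool the per-token logits down to one vector.
    pooled = pool(logits_per_token)
    # Step 3: apply the activation (sigmoid or softmax).
    return activation(pooled)


hidden_states = [[1.0, 0.0], [0.0, 1.0]]   # two tokens, hidden size 2
weights = [[1.0, 0.0], [0.0, 1.0]]          # identity classifier, 2 labels
bias = [0.0, 0.0]

probs_softmax = classifier_pooler_sketch(hidden_states, weights, bias, softmax)
probs_sigmoid = classifier_pooler_sketch(hidden_states, weights, bias, sigmoid)
```

With softmax the scores compete and sum to 1; with sigmoid each label is scored independently, which is why the choice is left to the model configuration.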
Source code in vllm/model_executor/layers/pooler.py
classification_act_fn
instance-attribute
¶
classification_act_fn = (
get_classification_activation_function(hf_config)
if act_fn is None
else act_fn
)
cross_encoder_act_fn
instance-attribute
¶
cross_encoder_act_fn = (
get_cross_encoder_activation_function(hf_config)
if act_fn is None
else act_fn
)
__init__
¶
__init__(
config: ModelConfig,
pooling: PoolingFn,
classifier: ClassifierFn,
act_fn: Optional[PoolerActivation] = None,
) -> None
Source code in vllm/model_executor/layers/pooler.py
forward
¶
forward(
hidden_states: Union[Tensor, list[Tensor]],
pooling_metadata: PoolingMetadata,
) -> PoolerOutput
Pools sentence pair scores from the hidden_states.
Source code in vllm/model_executor/layers/pooler.py
get_pooling_params
¶
get_pooling_params(
task: PoolingTask,
) -> Optional[PoolingParams]
Source code in vllm/model_executor/layers/pooler.py
LambdaPoolerActivation
¶
LastPool
¶
Bases: PoolingMethod
Source code in vllm/model_executor/layers/pooler.py
forward_all
¶
Source code in vllm/model_executor/layers/pooler.py
forward_one
¶
get_pooling_params
¶
get_pooling_params(
task: PoolingTask,
) -> Optional[PoolingParams]
Source code in vllm/model_executor/layers/pooler.py
MeanPool
¶
Bases: PoolingMethod
Source code in vllm/model_executor/layers/pooler.py
forward_all
¶
Source code in vllm/model_executor/layers/pooler.py
forward_one
¶
Source code in vllm/model_executor/layers/pooler.py
get_pooling_params
¶
get_pooling_params(
task: PoolingTask,
) -> Optional[PoolingParams]
Source code in vllm/model_executor/layers/pooler.py
Pooler
¶
The interface required for all poolers used in pooling models in vLLM.
Source code in vllm/model_executor/layers/pooler.py
forward
abstractmethod
¶
forward(
hidden_states: Union[list[Tensor], Tensor],
pooling_metadata: PoolingMetadata,
) -> PoolerOutput
from_config_with_defaults
staticmethod
¶
from_config_with_defaults(
pooler_config: PoolerConfig,
pooling_type: PoolingType,
normalize: bool,
softmax: bool,
step_tag_id: Optional[int] = None,
returned_token_ids: Optional[list[int]] = None,
) -> Pooler
Source code in vllm/model_executor/layers/pooler.py
get_pooling_params
¶
get_pooling_params(
task: PoolingTask,
) -> Optional[PoolingParams]
Construct the pooling parameters to use for a task, or None if the task is not supported.
PoolerActivation
¶
Bases: BasePoolerActivation
Source code in vllm/model_executor/layers/pooler.py
PoolerClassify
¶
Bases: PoolerActivation
Source code in vllm/model_executor/layers/pooler.py
forward_chunk
¶
Source code in vllm/model_executor/layers/pooler.py
PoolerHead
¶
Bases: Module
Source code in vllm/model_executor/layers/pooler.py
__init__
¶
__init__(activation: PoolerActivation) -> None
forward
¶
forward(
pooled_data: Union[list[Tensor], Tensor],
pooling_metadata: PoolingMetadata,
)
Source code in vllm/model_executor/layers/pooler.py
from_config
classmethod
¶
from_config(
pooler_config: ResolvedPoolingConfig,
) -> PoolerHead
Source code in vllm/model_executor/layers/pooler.py
PoolerIdentity
¶
Bases: PoolerActivation
Source code in vllm/model_executor/layers/pooler.py
PoolerNormalize
¶
Bases: PoolerActivation
Source code in vllm/model_executor/layers/pooler.py
PoolerScore
¶
PoolingMethod
¶
Source code in vllm/model_executor/layers/pooler.py
forward
¶
forward(
hidden_states: Union[Tensor, list[Tensor]],
pooling_metadata: PoolingMetadata,
) -> Union[list[Tensor], Tensor]
Source code in vllm/model_executor/layers/pooler.py
forward_all
abstractmethod
¶
forward_one
abstractmethod
¶
Note
Passing prompt_len=None is equivalent to prompt_len=len(hidden_states).
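To make the `prompt_len` default concrete, here is a small pure-Python sketch of per-sequence pooling. It assumes the conventional meanings suggested by the class names (CLS = first token, Last = final token, Mean = average over tokens, All = every token); the function names and list-based inputs are illustrative stand-ins, not the signatures of the real `forward_one` methods.

```python
def _resolve_len(hidden_states, prompt_len=None):
    # prompt_len=None falls back to the full length of hidden_states.
    return len(hidden_states) if prompt_len is None else prompt_len


def cls_pool_one(hidden_states, prompt_len=None):
    # First token of the sequence.
    return hidden_states[0]


def last_pool_one(hidden_states, prompt_len=None):
    # Final token within the first prompt_len tokens.
    n = _resolve_len(hidden_states, prompt_len)
    return hidden_states[n - 1]


def mean_pool_one(hidden_states, prompt_len=None):
    # Element-wise average over the first prompt_len tokens.
    n = _resolve_len(hidden_states, prompt_len)
    rows = hidden_states[:n]
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def all_pool_one(hidden_states, prompt_len=None):
    # Every token within the first prompt_len tokens.
    n = _resolve_len(hidden_states, prompt_len)
    return hidden_states[:n]


hs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 tokens, hidden size 2
```

Calling `last_pool_one(hs)` and `last_pool_one(hs, prompt_len=3)` give the same result here, which is the equivalence the note describes.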
Source code in vllm/model_executor/layers/pooler.py
from_pooling_type
staticmethod
¶
from_pooling_type(
pooling_type: PoolingType,
) -> PoolingMethod
Source code in vllm/model_executor/layers/pooler.py
get_pooling_params
abstractmethod
¶
get_pooling_params(
task: PoolingTask,
) -> Optional[PoolingParams]
PoolingType
¶
Bases: IntEnum
Enumeration for different types of pooling methods.
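As a rough illustration of how an `IntEnum` of pooling types can drive dispatch, consider the sketch below. The enum `PoolingKind`, its member values, and `pool_one` are hypothetical and chosen only for this example; the actual members and values of `PoolingType` should be read from the source file.

```python
from enum import IntEnum


class PoolingKind(IntEnum):
    # Hypothetical members and values, for illustration only.
    LAST = 0
    CLS = 1
    MEAN = 2


def pool_one(kind, hidden_states):
    # Dispatch on the enum member to pick a pooling strategy.
    if kind is PoolingKind.LAST:
        return hidden_states[-1]
    if kind is PoolingKind.CLS:
        return hidden_states[0]
    if kind is PoolingKind.MEAN:
        n = len(hidden_states)
        return [sum(r[j] for r in hidden_states) / n
                for j in range(len(hidden_states[0]))]
    raise NotImplementedError(f"Unsupported pooling kind: {kind!r}")


# An IntEnum member can be looked up by name (e.g. from a config string)
# or by integer value, and compares equal to its integer.
kind_from_name = PoolingKind["MEAN"]
kind_from_value = PoolingKind(0)
```

Looking members up by name is convenient when the pooling type arrives as a string in a configuration file, while the integer values keep the enum cheap to compare and serialize.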
Source code in vllm/model_executor/layers/pooler.py
ResolvedPoolingConfig
dataclass
¶
Source code in vllm/model_executor/layers/pooler.py
__init__
¶
__init__(
pooling_type: PoolingType,
normalize: bool,
softmax: bool,
step_tag_id: Optional[int],
returned_token_ids: Optional[list[int]],
) -> None
from_config_with_defaults
classmethod
¶
from_config_with_defaults(
pooler_config: PoolerConfig,
pooling_type: PoolingType,
normalize: bool,
softmax: bool,
step_tag_id: Optional[int] = None,
returned_token_ids: Optional[list[int]] = None,
) -> ResolvedPoolingConfig
Source code in vllm/model_executor/layers/pooler.py
SimplePooler
¶
Bases: Pooler
A layer that pools specific information from hidden states.
This layer does the following:
1. Extracts specific tokens or aggregates data based on pooling method.
2. Normalizes output if specified.
3. Returns structured results as PoolerOutput.
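The three steps listed above might look roughly like this in plain Python. This is a sketch under assumptions rather than the layer's real code: `SketchOutput` is a hypothetical stand-in for `PoolerOutput`, last-token pooling is picked arbitrarily as the pooling method, and L2 normalization is assumed as the normalization step.

```python
import math
from dataclasses import dataclass


@dataclass
class SketchOutput:
    # Hypothetical stand-in for a structured pooling result.
    data: list


def l2_normalize(vec, eps=1e-12):
    # Scale the vector to unit Euclidean length; eps guards against
    # division by zero for an all-zero vector.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def simple_pooler_sketch(sequences, normalize=True):
    outputs = []
    for hidden_states in sequences:
        # Step 1: extract a token (here: the last one) from each sequence.
        pooled = hidden_states[-1]
        # Step 2: normalize the pooled vector if requested.
        if normalize:
            pooled = l2_normalize(pooled)
        # Step 3: wrap the result in a structured output object.
        outputs.append(SketchOutput(data=pooled))
    return outputs


batch = [
    [[0.0, 0.0], [3.0, 4.0]],  # sequence of 2 tokens
    [[1.0, 0.0]],              # sequence of 1 token
]
results = simple_pooler_sketch(batch)
```

Keeping pooling and the post-processing head as separate steps mirrors the constructor shown for this class, which takes a pooling method and a head as two distinct arguments.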
Source code in vllm/model_executor/layers/pooler.py
__init__
¶
__init__(pooling: PoolingMethod, head: PoolerHead) -> None
forward
¶
forward(
hidden_states: Union[Tensor, list[Tensor]],
pooling_metadata: PoolingMetadata,
) -> PoolerOutput
Source code in vllm/model_executor/layers/pooler.py
from_config
classmethod
¶
from_config(
pooler_config: ResolvedPoolingConfig,
) -> SimplePooler
Source code in vllm/model_executor/layers/pooler.py
from_config_with_defaults
classmethod
¶
from_config_with_defaults(
pooler_config: PoolerConfig,
pooling_type: PoolingType,
normalize: bool,
softmax: bool,
) -> SimplePooler
Source code in vllm/model_executor/layers/pooler.py
get_pooling_params
¶
get_pooling_params(
task: PoolingTask,
) -> Optional[PoolingParams]
StepPooler
¶
Bases: Pooler
Source code in vllm/model_executor/layers/pooler.py
__init__
¶
__init__(
head: PoolerHead,
*,
step_tag_id: Optional[int] = None,
returned_token_ids: Optional[list[int]] = None,
) -> None
Source code in vllm/model_executor/layers/pooler.py
extract_states
¶
extract_states(
hidden_states: Union[Tensor, list[Tensor]],
pooling_metadata: PoolingMetadata,
) -> Union[list[Tensor], Tensor]
Source code in vllm/model_executor/layers/pooler.py
forward
¶
forward(
hidden_states: Union[Tensor, list[Tensor]],
pooling_metadata: PoolingMetadata,
) -> PoolerOutput
Source code in vllm/model_executor/layers/pooler.py
from_config
classmethod
¶
from_config(
pooler_config: ResolvedPoolingConfig,
) -> StepPooler
Source code in vllm/model_executor/layers/pooler.py
get_pooling_params
¶
get_pooling_params(
task: PoolingTask,
) -> Optional[PoolingParams]
Source code in vllm/model_executor/layers/pooler.py
get_prompt_token_ids
¶
get_prompt_token_ids(
pooling_metadata: PoolingMetadata,
) -> list[Tensor]
Source code in vllm/model_executor/layers/pooler.py
VisionPooler
¶
Bases: Pooler
Source code in vllm/model_executor/layers/pooler.py
__init__
¶
__init__(config: ModelConfig)
forward
¶
forward(
hidden_states: Tensor, pooling_metadata: PoolingMetadata
) -> PoolerOutput
Source code in vllm/model_executor/layers/pooler.py
from_config
classmethod
¶
from_config(model_config: ModelConfig) -> VisionPooler
get_pooling_params
¶
get_pooling_params(
task: PoolingTask,
) -> Optional[PoolingParams]
build_output
¶
build_output(all_data: Tensor) -> PoolerOutput
get_classification_activation_function
¶
get_cross_encoder_activation_function
¶
Source code in vllm/model_executor/layers/pooler.py
get_prompt_lens
¶
get_prompt_lens(
hidden_states: Union[Tensor, list[Tensor]],
pooling_metadata: PoolingMetadata,
) -> Tensor
Source code in vllm/model_executor/layers/pooler.py
mean_pool_with_position_kernel
¶
mean_pool_with_position_kernel(
hidden_states_ptr,
output_ptr,
seq_start,
seq_len,
hidden_size,
pool_start,
pool_end,
BLOCK_SIZE: constexpr,
)
Triton kernel to perform mean pooling over a specified token range.
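For readers who want a reference point for what such a kernel computes, the following pure-Python function averages hidden states over a token range. It is only a sketch: it assumes that `hidden_states` holds the rows of all sequences concatenated, that `pool_start` and `pool_end` are offsets relative to `seq_start`, and that the range is half-open. None of these conventions is confirmed by the signature above, and the function name `mean_pool_range_reference` is hypothetical.

```python
def mean_pool_range_reference(hidden_states, seq_start, seq_len,
                              pool_start, pool_end):
    # Assumption: [pool_start, pool_end) is a half-open range of token
    # offsets inside the sequence that begins at row `seq_start`.
    if not (0 <= pool_start < pool_end <= seq_len):
        raise ValueError("pool range must lie within the sequence")

    rows = hidden_states[seq_start + pool_start:seq_start + pool_end]
    hidden_size = len(rows[0])
    count = len(rows)

    # Average each hidden dimension over the selected tokens.
    return [sum(row[j] for row in rows) / count for j in range(hidden_size)]


# Five rows in total; the sequence of interest occupies rows 1..3.
flat_states = [
    [0.0, 0.0],
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
    [9.0, 9.0],
]
pooled = mean_pool_range_reference(
    flat_states, seq_start=1, seq_len=3, pool_start=1, pool_end=3
)
```

A GPU kernel would typically process `BLOCK_SIZE` hidden dimensions per program instance instead of looping over them one by one, but the arithmetic per dimension is the same sum-then-divide shown here.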