Module Name: MMFusionFFN¶
Overview¶
The MMFusionFFN
module represents a positionwise feedforward layer and is used in the context of multi-modal image and text processing.
Class Definition¶
MMFusionFFN(dim, hidden_dim, dropout=0.0)
Args¶
Name | Type | Description | Default |
---|---|---|---|
dim | int | Input dimension | - |
hidden_dim | int | Hidden dimension | - |
output_dim | int | Output dimension | - |
dropout | float | Dropout probability. | 0.1 |
Functionality and Usage¶
The MMFusionFFN
module is a subclass of the nn.Module
class and contains a forward
method which computes the output of the positionwise feedforward layer.
The method performs the following operations: 1. Apply layer normalization to the input tensor. 2. Pass the resulting tensor through a linear transformation (fully connected layer) with a SiLU (Sigmoid Linear Unit) activation function. 3. Apply dropout to the tensor. 4. Repeat steps 2 and 3 with a second fully connected layer. 5. Return the output tensor.
Usage Examples¶
import torch
from torch import nn
from zeta.nn import MMFusionFFN
# Define the input and hidden dimensions
dim = 512
hidden_dim = 1024
output_dim = 512
dropout = 0.1
# Create an instance of MMFusionFFN
ffn = MMFusionFFN(dim, hidden_dim, output_dim, dropout)
# Example 1 - Forward pass with random input data
input_data = torch.randn(
5, 32, dim
) # Random input data of shape (5, 32, dim)
output = ffn(input_data)
print(output.shape) # Output tensor shape
# Example 2 - Create an instance with default dropout
ffn_default_dropout = MMFusionFFN(dim, hidden_dim, output_dim)
# Example 3 - Forward pass with another input data
input_data2 = torch.randn(
8, 16, dim
) # Random input data of shape (8, 16, dim)
output2 = ffn_default_dropout(input_data2)
print(output2.shape) # Output tensor shape
Additional Information and Tips¶
- The
MMFusionFFN
module is commonly used in multimodal machine learning applications to process multi-dimensional input data from different modalities, such as image and text. - The most important parameters to consider when creating an instance of
MMFusionFFN
aredim
andhidden_dim
. These parameters can be adjusted based on the specifics of the input data and the desired level of transformation. - The
dropout
parameter controls the probability of an element to be zeroed in the forward pass, which can help prevent overfitting.
References and Resources¶
- PyTorch Documentation: nn.Module
- Hugging Face Documentation: SiLU Activation Function
This comprehensive documentation provides a detailed overview of the MMFusionFFN
module, including its purpose, architecture, usage examples, and additional information. Developers can now use this documentation to effectively utilize the module in their applications.
The examples illustrate how to create instances of MMFusionFFN
, perform forward passes, and handle different input shapes, providing a practical guide for utilizing the module. Additionally, important attributes, such as dim
, hidden_dim
, and dropout
, are explained in the class definition table for easy reference and understanding.