MLP Documentation
Table of Contents
- Introduction
- Purpose and Functionality
- Class: MLP
  - Initialization
  - Parameters
  - Forward Method
- Usage Examples
  - Using the MLP Class
- Additional Information
- References
1. Introduction
Welcome to the Zeta documentation! In this document, we explore the MLP class, part of the Zeta library. The MLP class implements a Multi-Layer Perceptron (MLP) neural network module. This page gives a comprehensive overview of the purpose, functionality, and usage of the MLP class.
2. Purpose and Functionality
The MLP class implements a Multi-Layer Perceptron (MLP) module, a type of artificial neural network commonly used in deep learning. MLPs are composed of multiple layers of fully connected neurons and are known for their ability to approximate complex functions.
The key features of the MLP class include:
- Configurable architecture: You can specify the input and output dimensions, the expansion factor for the hidden layers, the number of hidden layers, and whether to apply layer normalization (see the sketch after this list).
- Activation functions: The MLP uses the Sigmoid Linear Unit (SiLU) activation function, which has been shown to improve training dynamics.
- Optional layer normalization: You can enable or disable layer normalization for the hidden layers.
- Flexibility: MLPs can be used for a wide range of tasks, including regression, classification, and function approximation.
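For intuition, here is a minimal sketch of the kind of architecture these features describe. It is an illustrative reimplementation, not Zeta's actual source, and it assumes the hidden width is dim_in * expansion_factor:

from torch import nn

# Illustrative sketch (not Zeta's actual implementation): `depth`
# hidden layers of width int(dim_in * expansion_factor), SiLU
# activations, optional LayerNorm, and a final projection to dim_out.
def build_mlp(dim_in, dim_out, expansion_factor=2.0, depth=2, norm=False):
    hidden = int(dim_in * expansion_factor)
    layers = []
    in_dim = dim_in
    for _ in range(depth):
        layers.append(nn.Linear(in_dim, hidden))
        if norm:
            layers.append(nn.LayerNorm(hidden))
        layers.append(nn.SiLU())
        in_dim = hidden
    layers.append(nn.Linear(in_dim, dim_out))
    return nn.Sequential(*layers)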
3. Class: MLP
The MLP class implements the Multi-Layer Perceptron (MLP) neural network module. Let's delve into its details.
Initialization
To create an instance of the MLP class, you need to specify the following parameters (example instantiations follow the list):
Parameters
- dim_in (int): The dimensionality of the input tensor.
- dim_out (int): The dimensionality of the output tensor.
- expansion_factor (float, optional): The expansion factor for the hidden dimension. Default is 2.0.
- depth (int, optional): The number of hidden layers. Default is 2.
- norm (bool, optional): Whether to apply layer normalization to the hidden layers. Default is False.
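For example, given the constructor signature above, an instance can be created with defaults or with a customized architecture:

from zeta.nn import MLP

# Default hidden settings: depth=2, expansion_factor=2.0, norm=False.
mlp_default = MLP(dim_in=128, dim_out=64)

# Deeper, wider variant with layer normalization enabled.
mlp_custom = MLP(dim_in=128, dim_out=64, expansion_factor=4.0, depth=4, norm=True)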
Forward Method
The forward method of the MLP class performs the forward pass of the MLP module. It takes an input tensor and returns the output tensor.
def forward(self, x):
    """
    Forward pass of the MLP module.

    Args:
        x (torch.Tensor): The input tensor.

    Returns:
        torch.Tensor: The output tensor.
    """
    # Inputs are cast to float32 before passing through the network.
    return self.net(x.float())
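Because forward casts its input with x.float(), lower-precision inputs should also be accepted, and the module is invoked through __call__ as usual. A quick shape check (an illustrative sketch based on the forward pass shown above):

import torch
from zeta.nn import MLP

mlp = MLP(dim_in=8, dim_out=4)

# The input is cast to float32 inside forward(), so float16 works too.
x = torch.randn(5, 8, dtype=torch.float16)
out = mlp(x)
print(out.shape)  # torch.Size([5, 4])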
4. Usage Examples
Let's explore how to use the MLP class effectively in various scenarios.
Using the MLP Class
Here's how to use the MLP class to create and apply an MLP neural network:
import torch
from zeta.nn import MLP
# Create an instance of MLP
mlp = MLP(dim_in=256, dim_out=10, expansion_factor=4.0, depth=3, norm=True)
# Create an input tensor
x = torch.randn(32, 256)
# Apply the MLP
output = mlp(x)
# Output tensor
print(output)
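As a further illustrative scenario (random data, standard PyTorch APIs only), the module slots into an ordinary training step, for example for 10-class classification:

import torch
from torch import nn
from zeta.nn import MLP

mlp = MLP(dim_in=256, dim_out=10, expansion_factor=4.0, depth=3, norm=True)
optimizer = torch.optim.Adam(mlp.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One training step on random data.
x = torch.randn(32, 256)
labels = torch.randint(0, 10, (32,))

logits = mlp(x)                   # forward pass
loss = criterion(logits, labels)  # cross-entropy on the logits

optimizer.zero_grad()
loss.backward()
optimizer.step()
print(loss.item())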
5. Additional Information
Multi-Layer Perceptrons (MLPs) are versatile neural network architectures that can be adapted to various tasks. Here are some additional notes:
- Hidden Layer Configuration: You can customize the architecture of the MLP by adjusting parameters such as expansion_factor and depth. These parameters control the width and depth of the hidden layers (see the parameter-count sketch after this list).
- Layer Normalization: Layer normalization can help stabilize training and improve convergence, especially in deep networks. You can enable it by setting the norm parameter to True.
- Activation Function: The MLP uses the Sigmoid Linear Unit (SiLU) activation function, which is known for its smooth gradients and improved training dynamics.
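To see how these knobs affect model size, you can count trainable parameters, a standard PyTorch idiom shown here as a sketch:

from zeta.nn import MLP

small = MLP(dim_in=256, dim_out=10, expansion_factor=2.0, depth=2)
large = MLP(dim_in=256, dim_out=10, expansion_factor=4.0, depth=4)

def count_params(module):
    # Sum the sizes of all trainable tensors in the module.
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

print(count_params(small), count_params(large))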
6. References
For further information on Multi-Layer Perceptrons (MLPs) and related concepts, you can refer to the following resources:
- Deep Learning: "Deep Learning" by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. This book provides an in-depth understanding of neural networks, including MLPs.
- PyTorch Documentation: Official PyTorch documentation for related functions and modules.
This documentation provides a comprehensive overview of the Zeta library's MLP class. It aims to help you understand the purpose, functionality, and usage of the MLP class for building Multi-Layer Perceptron neural networks for various tasks.