SinusoidalEmbeddings Documentation

Table of Contents

- Introduction
- Purpose and Functionality
- Class: `SinusoidalEmbeddings`
  - Initialization
  - Parameters
  - Forward Method
- Function: `rotate_half`
- Function: `apply_rotary_pos_emb`
- Usage Examples
  - Using the `SinusoidalEmbeddings` Class
  - Using the `rotate_half` Function
  - Using the `apply_rotary_pos_emb` Function
- Additional Information
  - Sinusoidal Positional Embeddings
  - Rotary Positional Embeddings
- References
1. Introduction

Welcome to the Zeta documentation! This documentation provides an in-depth explanation of the `SinusoidalEmbeddings` class and related functions. Zeta is a PyTorch library that aims to simplify complex deep learning tasks. In this documentation, we will explore how `SinusoidalEmbeddings` and its associated functions work and how they can be applied in various scenarios.
2. Purpose and Functionality

The `SinusoidalEmbeddings` class is designed to generate sinusoidal positional embeddings for sequences in transformer-based models. These embeddings are essential for self-attention mechanisms to understand the positions of elements within a sequence. This documentation also covers the `rotate_half` and `apply_rotary_pos_emb` functions, which apply positional embeddings to input data.
3. Class: `SinusoidalEmbeddings`

The `SinusoidalEmbeddings` class generates sinusoidal positional embeddings. It offers flexible configuration and can produce both basic sinusoidal embeddings and rotary positional embeddings.
Initialization

To create an instance of the `SinusoidalEmbeddings` class, specify the following parameters:
Parameters

- `dim` (int): The dimensionality of the embeddings.
- `scale_base` (float or None, optional): The scale base for rotary positional embeddings. Must be defined if `use_xpos` is set to `True`. Default is `None`.
- `use_xpos` (bool, optional): Whether to use xpos scaling. If `True`, scale embeddings are generated alongside the frequency embeddings, and `scale_base` must be defined. Default is `False`.
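As a quick illustration of these options, here is a minimal sketch of both configurations (using the `zeta.nn` import path that the examples later in this document rely on):

```python
from zeta.nn import SinusoidalEmbeddings

# Basic sinusoidal embeddings: only the dimensionality is required
basic_emb = SinusoidalEmbeddings(dim=512)

# Rotary embeddings with xpos scaling: scale_base must be set
# whenever use_xpos=True
xpos_emb = SinusoidalEmbeddings(dim=512, use_xpos=True, scale_base=1000)
```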
Forward Method

The `forward` method of the `SinusoidalEmbeddings` class generates sinusoidal positional embeddings based on the input sequence length. It returns two tensors: the frequency embeddings and the scale embeddings.
4. Function: `rotate_half`

The `rotate_half` function rotates an input tensor along its last dimension by negating one half of the features and swapping it with the other half. This is the rotation step required when applying rotary positional embeddings.
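Conceptually, the operation looks like the following sketch; the exact Zeta implementation may split or interleave the last dimension differently:

```python
import torch

def rotate_half_sketch(x: torch.Tensor) -> torch.Tensor:
    # Split the last dimension into two halves
    x1, x2 = x.chunk(2, dim=-1)
    # Negate the second half and swap it with the first,
    # so each pair (x1, x2) becomes (-x2, x1)
    return torch.cat((-x2, x1), dim=-1)
```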
Parameters

- `x` (Tensor): The input tensor to be rotated.
Usage Example

```python
import torch
from zeta.nn import rotate_half

# Create an input tensor
x = torch.randn(2, 3, 4)

# Rotate the input tensor
rotated_x = rotate_half(x)
```
5. Function: `apply_rotary_pos_emb`

The `apply_rotary_pos_emb` function applies rotary positional embeddings to input query and key tensors. It takes care of the angular transformations needed for rotary embeddings.
Parameters

- `q` (Tensor): The query tensor to which rotary positional embeddings will be applied.
- `k` (Tensor): The key tensor to which rotary positional embeddings will be applied.
- `freqs` (Tensor): The frequency embeddings generated by the `SinusoidalEmbeddings` class.
- `scale` (Tensor or float): The scale embeddings for rotary positional embeddings.
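Under the hood, applying rotary embeddings typically follows the pattern sketched below. This is a conceptual sketch rather than Zeta's exact code; in xpos-style implementations, queries are multiplied by `scale` and keys by its inverse so that the scaling cancels in the attention dot product:

```python
import torch
from zeta.nn import rotate_half

def apply_rotary_sketch(q, k, freqs, scale):
    # Rotate each feature pair of q and k by its position-dependent angle;
    # queries scale up, keys scale down (xpos-style)
    q_out = (q * freqs.cos() * scale) + (rotate_half(q) * freqs.sin() * scale)
    k_out = (k * freqs.cos() / scale) + (rotate_half(k) * freqs.sin() / scale)
    return q_out, k_out
```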
Usage Example

```python
import torch
from zeta.nn import SinusoidalEmbeddings, apply_rotary_pos_emb

# Create query and key tensors
q = torch.randn(2, 3, 4)
k = torch.randn(2, 3, 4)

# Generate frequency and scale embeddings using SinusoidalEmbeddings
positional_embedding = SinusoidalEmbeddings(dim=4, use_xpos=True, scale_base=1000)
freqs, scale = positional_embedding(q)

# Apply rotary positional embeddings
q_emb, k_emb = apply_rotary_pos_emb(q, k, freqs, scale)
```
6. Usage Examples

Let's explore some usage examples of the `SinusoidalEmbeddings` class and associated functions to understand how they can be used effectively.
Using the `SinusoidalEmbeddings` Class

```python
import torch
from zeta.nn import SinusoidalEmbeddings

# Create an instance of SinusoidalEmbeddings
positional_embedding = SinusoidalEmbeddings(dim=512, use_xpos=True, scale_base=1000)

# Create an input sequence tensor
sequence = torch.randn(1, 10, 512)

# Generate positional embeddings
freqs, scale = positional_embedding(sequence)
```
Using the `rotate_half` Function

This example demonstrates how to use the `rotate_half` function:

```python
import torch
from zeta.nn import rotate_half

# Create an input tensor
x = torch.randn(2, 3, 4)

# Rotate the input tensor
rotated_x = rotate_half(x)
```
Using the `apply_rotary_pos_emb` Function

This example demonstrates how to apply rotary positional embeddings using the `apply_rotary_pos_emb` function:

```python
import torch
from zeta.nn import SinusoidalEmbeddings, apply_rotary_pos_emb

# Create query and key tensors
q = torch.randn(2, 3, 4)
k = torch.randn(2, 3, 4)

# Generate frequency and scale embeddings using SinusoidalEmbeddings
positional_embedding = SinusoidalEmbeddings(dim=4, use_xpos=True, scale_base=1000)
freqs, scale = positional_embedding(q)

# Apply rotary positional embeddings
q_emb, k_emb = apply_rotary_pos_emb(q, k, freqs, scale)
```
7. Additional Information

Sinusoidal Positional Embeddings

Sinusoidal positional embeddings are essential for transformer-based models to understand the positions of elements within a sequence. They provide information about the order and location of tokens in the input sequence.
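In the classic formulation from Attention Is All You Need, position pos and channel pair i are encoded as sin(pos / 10000^(2i/dim)) and cos(pos / 10000^(2i/dim)). A minimal sketch of how such a table can be computed:

```python
import torch

def sinusoidal_table(seq_len: int, dim: int) -> torch.Tensor:
    # One inverse frequency per channel pair: 10000^(-2i/dim)
    inv_freq = 1.0 / (10000 ** (torch.arange(0, dim, 2).float() / dim))
    # Angles per (position, channel pair): shape (seq_len, dim // 2)
    angles = torch.arange(seq_len).float().unsqueeze(1) * inv_freq
    table = torch.zeros(seq_len, dim)
    table[:, 0::2] = torch.sin(angles)  # even channels use sine
    table[:, 1::2] = torch.cos(angles)  # odd channels use cosine
    return table
```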
Rotary Positional Embeddings

Rotary positional embeddings are a specialized type of positional embedding, particularly useful in some transformer variants. They apply angular transformations to queries and keys so that attention scores depend on the relative positions of tokens.
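Concretely, for a token at position m, rotary embeddings rotate each consecutive feature pair (x_{2i}, x_{2i+1}) by the angle m·θ_i, where θ_i = 10000^(−2i/dim):

```
x'_{2i}   = x_{2i}·cos(m·θ_i) − x_{2i+1}·sin(m·θ_i)
x'_{2i+1} = x_{2i}·sin(m·θ_i) + x_{2i+1}·cos(m·θ_i)
```

Because the rotation angle depends only on position, the dot product between a rotated query and a rotated key depends only on their relative offset.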
8. References

For further information on positional embeddings in transformers and related concepts, refer to the following resources:

- Attention Is All You Need: the original transformer paper, which introduced sinusoidal positional embeddings.
- Rotary Position Embeddings: a research paper discussing rotary positional embeddings and their advantages.
- PyTorch Documentation: the official PyTorch documentation for related functions and modules.
This documentation provides a comprehensive overview of the Zeta library's `SinusoidalEmbeddings` class and associated functions. It aims to help you understand the purpose, functionality, and usage of these components for positional embeddings in transformer-based models.