ZetaCloud Documentation
Overview
ZetaCloud is a command-line tool for training or fine-tuning machine learning models on remote GPU clusters. A few commands are enough to start, stop, and monitor tasks on the GPUs you request. This documentation covers the ZetaCloud CLI from installation to advanced usage.
Table of Contents
- Installation
- ZetaCloud CLI
  - Options
- Basic Usage
  - Example 1: Starting a Task
  - Example 2: Stopping a Task
  - Example 3: Checking Task Status
- Advanced Usage
  - Example 4: Cluster Selection
  - Example 5: Choosing the Cloud Provider
- Additional Information
- References
1. Installation
Getting started with ZetaCloud is quick and straightforward. Follow these steps to set up ZetaCloud on your machine:
- Open your terminal or command prompt.
- Install the zetascale package using pip: pip install zetascale
- After a successful installation, run the ZetaCloud CLI with the -h (or --help) option to display a list of available options and basic usage information.
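As a quick sanity check after installation, the following sketch asks Python's package metadata whether the zetascale package is present. The helper name installed_version is hypothetical and not part of ZetaCloud; it relies only on the standard-library importlib.metadata module.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing.

    Hypothetical helper for illustration; not part of ZetaCloud.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("zetascale")
    if found is None:
        print("zetascale is not installed; run: pip install zetascale")
    else:
        print(f"zetascale {found} is installed")
```

If the package is reported as missing, check that pip installed it into the same Python environment you are running the check from.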
2. ZetaCloud CLI
The ZetaCloud command-line interface (CLI) provides the following options for managing tasks on GPU clusters:
Options
- -h, --help: Display the help message and exit.
- -t TASK_NAME, --task_name TASK_NAME: Specify the name of your task.
- -c CLUSTER_NAME, --cluster_name CLUSTER_NAME: Specify the name of the cluster you want to use.
- -cl CLOUD, --cloud CLOUD: Choose the cloud provider (e.g., AWS, Google Cloud, Azure).
- -g GPUS, --gpus GPUS: Specify the number and type of GPUs required for your task.
- -f FILENAME, --filename FILENAME: Provide the filename of your Python script or code.
- -s, --stop: Use this flag to stop a running task.
- -d, --down: Use this flag to terminate a cluster.
- -sr, --status_report: Check the status of your task.
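To make the shape of this option set concrete, here is a minimal argparse sketch that mirrors the flags listed above. It is an illustration only, not ZetaCloud's actual implementation: the function name build_parser, the program name, and the choice of store_true for the three flags without a value are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented options (illustrative only)."""
    # "zetacloud-sketch" is a placeholder program name, not the real executable.
    parser = argparse.ArgumentParser(prog="zetacloud-sketch")
    # argparse adds -h/--help automatically.
    parser.add_argument("-t", "--task_name", help="name of the task")
    parser.add_argument("-c", "--cluster_name", help="name of the cluster to use")
    parser.add_argument("-cl", "--cloud", help="cloud provider, e.g. AWS")
    parser.add_argument("-g", "--gpus", help="GPU type and count, e.g. A100:8")
    parser.add_argument("-f", "--filename", help="Python script to run")
    # Flags that take no value are modelled as booleans (an assumption).
    parser.add_argument("-s", "--stop", action="store_true", help="stop a running task")
    parser.add_argument("-d", "--down", action="store_true", help="terminate a cluster")
    parser.add_argument("-sr", "--status_report", action="store_true",
                        help="check the status of the task")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-f", "train.py", "-g", "A100:8"])
    print(args.filename, args.gpus, args.stop)
```

Note that -c and -cl, and likewise -s and -sr, share a prefix; argparse resolves each by exact match on the option string, so the two-letter forms are parsed as distinct options.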
3. Basic Usage
ZetaCloud's basic usage covers essential tasks such as starting, stopping, and checking the status of your tasks. Let's explore these tasks with examples.
Example 1: Starting a Task
To start a task, specify the Python script you want to run and the GPU configuration, for example by passing -f train.py and -g A100:8 to the CLI.
In this example:
- -f train.py indicates that you want to run the Python script named train.py.
- -g A100:8 specifies that you require 8 NVIDIA A100 GPUs for your task.
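The GPU value in this example follows a TYPE:COUNT pattern. The sketch below shows one way such a string could be validated before it is passed to the CLI. The names GpuRequest and parse_gpu_spec are hypothetical, and whether ZetaCloud accepts other forms (for example a type without a count) is not stated here, so only the form shown above is handled.

```python
from typing import NamedTuple


class GpuRequest(NamedTuple):
    """GPU type and count taken from a TYPE:COUNT string (hypothetical type)."""
    gpu_type: str
    count: int


def parse_gpu_spec(spec: str) -> GpuRequest:
    """Split a string such as 'A100:8' into a GPU type and a positive count."""
    gpu_type, sep, count_text = spec.partition(":")
    valid = bool(gpu_type) and bool(sep) and count_text.isdecimal()
    if not valid or int(count_text) < 1:
        raise ValueError(f"expected TYPE:COUNT such as 'A100:8', got {spec!r}")
    return GpuRequest(gpu_type, int(count_text))


if __name__ == "__main__":
    request = parse_gpu_spec("A100:8")
    print(request.gpu_type, request.count)
```

Validating the string locally catches typos such as a missing count before any cluster resources are requested.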
Example 2: Stopping a Task
If you need to stop a running task, run the CLI with the -s (or --stop) flag.
This stops the currently running task.
Example 3: Checking Task Status
To check the status of your task, run the CLI with the -sr (or --status_report) flag.
The command prints a status report for your active task.
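If you drive ZetaCloud from a Python script, the stop and status flags can be wrapped as in the sketch below. The executable name zeta is an assumption (this document does not name the binary), and stop_command, status_command, and run are hypothetical helpers. The wrapper defaults to a dry run that only prints the command line.

```python
import shlex
import subprocess
from typing import List

ZETA_BIN = "zeta"  # assumption: the CLI executable name is not given in this document


def stop_command() -> List[str]:
    """Argument list for stopping the running task (-s flag)."""
    return [ZETA_BIN, "-s"]


def status_command() -> List[str]:
    """Argument list for requesting a status report (-sr flag)."""
    return [ZETA_BIN, "-sr"]


def run(command: List[str], dry_run: bool = True) -> str:
    """Print the command in dry-run mode, otherwise execute it.

    Returns the shell-quoted command line in both cases.
    """
    rendered = shlex.join(command)
    if dry_run:
        print(f"[dry run] {rendered}")
    else:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
    return rendered


if __name__ == "__main__":
    run(status_command())
    run(stop_command())
```

Passing the arguments as a list avoids shell quoting problems; set dry_run=False only once the executable name has been confirmed on your system.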
4. Advanced Usage
ZetaCloud also offers advanced options that allow you to fine-tune your tasks according to your specific requirements.
Example 4: Cluster Selection
You can select a specific cluster for your task by passing the cluster name with the -c option. For example, -c my_cluster runs your task on the cluster named my_cluster.
Example 5: Choosing the Cloud Provider
ZetaCloud supports multiple cloud providers. You can specify your preferred provider with the -cl option. For example, passing a provider such as AWS with -cl executes your task on that provider's infrastructure.
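The cluster and cloud options can be combined with the script and GPU options when assembling a command line. The sketch below appends only the options that are set. The executable name zeta and the helper task_command are assumptions, and the idea that these flags can be freely combined in one invocation is inferred from the option list rather than confirmed here.

```python
from typing import List, Optional

ZETA_BIN = "zeta"  # assumption: the CLI executable name is not given in this document


def task_command(
    filename: str,
    gpus: Optional[str] = None,
    cluster_name: Optional[str] = None,
    cloud: Optional[str] = None,
) -> List[str]:
    """Build an argument list, adding each optional flag only when it is set."""
    command = [ZETA_BIN, "-f", filename]
    if gpus:
        command += ["-g", gpus]
    if cluster_name:
        command += ["-c", cluster_name]
    if cloud:
        command += ["-cl", cloud]
    return command


if __name__ == "__main__":
    print(task_command("train.py", cluster_name="my_cluster"))
    print(task_command("train.py", gpus="A100:8", cloud="AWS"))
```

Keeping the optional flags out of the list when they are unset leaves the CLI free to apply its own defaults for the cluster and provider.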
5. Additional Information
- ZetaCloud simplifies the process of utilizing GPU clusters, allowing you to focus on your machine learning tasks rather than infrastructure management.
- You can easily adapt ZetaCloud to various cloud providers, making it a versatile tool for your machine learning needs.