Getting Started with the JuiceFS Python SDK

2024-10-30
Herald Yu

As more users turn to containerized cloud computing for AI model training, a common issue arises: these environments often lack the privileges needed to mount JuiceFS using the Filesystem in Userspace (FUSE) module. This limitation can restrict access to the JuiceFS file system inside containers.

To address this, JuiceFS Enterprise Edition 5.1 introduces a Python SDK. This SDK allows users to bypass the need for FUSE, enabling direct programmatic access to JuiceFS. It also makes it easier to integrate JuiceFS into your applications for more flexible usage.

In this article, we’ll show you how to use the Python SDK. The feature is currently in beta, and you’re welcome to try it out and share your feedback with us.

Installation and initialization

The JuiceFS Python SDK requires Python 3.8 or later. You can install it using the following command:

pip install https://static.juicefs.com/misc/juicefs-5.1.1-py3-none-any.whl

To start using JuiceFS, initialize a client object by providing the relevant file system details:

import juicefs

# Initialize your JuiceFS client.
jfs = juicefs.Client('volume-name',          # Name of the file system
                    token='xxx',             # File system token
                    access_key='your-ak',    # Object storage's access key
                    secret_key='your-sk')    # Object storage's secret key
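
Rather than hardcoding the token and keys as in the snippet above, you may prefer to read them from the environment. The following is a minimal sketch, not part of the SDK: the helper names and the `JFS_*` environment variable names are hypothetical, and only the `juicefs.Client` call mirrors the initialization shown above.

```python
import os


def jfs_settings_from_env(env=os.environ):
    """Collect the Client arguments from environment variables.

    The JFS_* variable names are hypothetical; choose whatever fits your setup.
    """
    required = {
        "name": "JFS_VOLUME",
        "token": "JFS_TOKEN",
        "access_key": "JFS_ACCESS_KEY",
        "secret_key": "JFS_SECRET_KEY",
    }
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in required.items()}


def make_client(env=os.environ):
    """Build a client using the same call shape as the example above."""
    import juicefs  # imported lazily so the settings helper works without the SDK

    s = jfs_settings_from_env(env)
    return juicefs.Client(s["name"],
                          token=s["token"],
                          access_key=s["access_key"],
                          secret_key=s["secret_key"])
```

With the four variables set, `jfs = make_client()` returns a client equivalent to the one created above, and a missing variable fails early with a clear error.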

Basic file operations

The JuiceFS Python SDK exposes an interface similar to Python's standard os module, so Python developers can get started quickly.

Here are some common operations you can perform:

  • List files: Use the listdir method to list files in a directory.
  • Create directories: Use the makedirs method to create a new directory.
  • Check file existence: Use the exists method to check if a file or directory exists.
  • File reading and writing: Use the open method to read and write files.
  • Delete files: Use the remove method to delete a file.

Below is an example demonstrating these basic operations:

# List files in a directory.
print(jfs.listdir('/'))

# Create a directory.
jfs.makedirs("/files")

# Write to a file.
with jfs.open("/files/hello.txt", "w") as f:
    f.write("hello")

# Read a file.
with jfs.open("/files/hello.txt") as f:
    data = f.read()
    print(data)

# Check that the file exists, then delete it.
if jfs.exists("/files/hello.txt"):
    jfs.remove("/files/hello.txt")
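
The methods listed above combine naturally into small helpers. The sketch below writes a file only if it does not already exist, creating the parent directory first. `write_text_if_absent` and `LocalFS` are hypothetical names introduced for illustration: the helper relies only on the five method names mentioned in this article, and `LocalFS` is a local-disk stand-in so the example runs without a JuiceFS volume. With a real client, you would pass `jfs` instead.

```python
import os
import posixpath
import tempfile


def write_text_if_absent(fs, path, text):
    """Write text to path unless it already exists; return True if written."""
    if fs.exists(path):
        return False
    parent = posixpath.dirname(path)
    # Guard with exists() because the behavior of makedirs on an
    # existing directory is not assumed here.
    if parent and not fs.exists(parent):
        fs.makedirs(parent)
    with fs.open(path, "w") as f:
        f.write(text)
    return True


class LocalFS:
    """Hypothetical stand-in that maps the same method names onto a local directory."""

    def __init__(self, root):
        self.root = root

    def _local(self, path):
        return os.path.join(self.root, path.lstrip("/"))

    def exists(self, path):
        return os.path.exists(self._local(path))

    def makedirs(self, path):
        os.makedirs(self._local(path))

    def listdir(self, path):
        return os.listdir(self._local(path))

    def open(self, path, mode="r"):
        return open(self._local(path), mode)

    def remove(self, path):
        os.remove(self._local(path))


with tempfile.TemporaryDirectory() as root:
    fs = LocalFS(root)
    first = write_text_if_absent(fs, "/files/hello.txt", "hello")     # file is created
    second = write_text_if_absent(fs, "/files/hello.txt", "ignored")  # file is left alone
    with fs.open("/files/hello.txt") as f:
        content = f.read()
    fs.remove("/files/hello.txt")
    remaining = fs.listdir("/files")
```

Because the helper takes the file system object as a parameter, the same code can be exercised locally in tests and pointed at a JuiceFS client in production.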

Advanced features

The JuiceFS Python SDK also supports advanced operations such as modifying file permissions, creating and reading symbolic links, and setting or getting extended attributes. For detailed usage examples and API references, see the JuiceFS Cloud Service documentation.
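
For orientation, here is how two of these operations look with Python's standard os module on a local temporary directory. This is only an illustration of the operations themselves. Whether the SDK client uses the same method names and arguments is an assumption you should verify against the documentation.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "hello.txt")
    link = os.path.join(root, "hello-link.txt")
    with open(target, "w") as f:
        f.write("hello")

    # Change the file permissions to rw-r-----.
    os.chmod(target, 0o640)
    mode = stat.S_IMODE(os.stat(target).st_mode)

    # Create a symbolic link, then read back the path it points to.
    os.symlink(target, link)
    link_target = os.readlink(link)
```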

Conclusion

The JuiceFS Python SDK offers a new way to access JuiceFS file systems without FUSE, which is especially useful in environments with limited privileges, such as containers and serverless platforms, including those used for AI model training. It provides greater flexibility for file operations, and we hope this tool will enhance your workflow.

If you have any questions about this article, feel free to join the JuiceFS discussions on GitHub and our community on Slack.

Author

Herald Yu
Technical Writer at Juicedata
