Challenges and pain points
- AI workflow pipelines are long and complex, and the computing, training, and application frameworks involved require different data access methods. This forces teams to deploy multiple storage systems supporting different access protocols. As a result, data becomes scattered across these systems, which complicates data management, increases maintenance burdens, and reduces application efficiency.
- I/O performance affects GPU utilization across the stages of an AI workflow pipeline, including data preprocessing, dataset reading during training, checkpoint writing, and model loading. This directly influences costs and time to market for enterprise products.
- The training datasets of some models, such as visual and multi-modal models, contain billions to tens of billions of small files. Handling lots of small files (LOSF) has always been a challenge in the storage domain.
- Most storage systems' performance scales linearly with data volume or the number of disks. However, scaling clusters to support growing application loads brings complex maintenance tasks and may disrupt existing services. It is challenging to find high-performance and elastic storage solutions for AI tasks.
- In large-scale AI model training, like large language model (LLM) training, the growth of clusters and data scale increases data access pressure, while traditional storage area network (SAN) and network attached storage (NAS) systems face scalability challenges in supporting thousands or even tens of thousands of concurrent client requests.
- The significant demand for GPU computing power in AI model training has made multi-cloud and hybrid-cloud configurations standard for many enterprises. However, directly accessing a central storage system from different locations in such an architecture is constrained by bandwidth and cost, making it difficult to meet performance requirements.
- Data is a crucial asset for AI businesses, often used by multiple teams. Management requirements for datasets, checkpoints, logs, and other data types include permission settings, versioning, capacity limits, and integration with environments like Kubernetes. Therefore, storage system selection needs to consider challenges like ACL control, subdirectory mounting, and Kubernetes deployment methods.
Why JuiceFS
- JuiceFS provides POSIX-, HDFS-, and S3-compatible interfaces, serving as a unified storage layer for AI job pipelines and reducing unnecessary data copying and migration (see the unified-access sketch after this list).
- JuiceFS offers data isolation and security for storage shared across teams, with capabilities such as token-based mounting and access control, Linux file permissions, POSIX ACLs, subdirectory mounting, capacity and inode quotas, and traffic QoS (Quality of Service); the access-control sketch after this list shows how standard POSIX permission tooling applies on a mounted volume.
- JuiceFS implements caching, prefetching, and concurrent reading strategies to improve I/O efficiency. Its proprietary high-performance metadata service can handle millions of requests per second, with average response times on the order of hundreds of microseconds to milliseconds.
- In read-heavy scenarios such as model training, JuiceFS uses multi-level caching to accelerate data retrieval, and its read throughput can scale elastically to hundreds of gigabytes per second (see the data-loading sketch after this list).
- The metadata service and object storage of JuiceFS can meet the concurrent demands of thousands to tens of thousands of clients. The automatic caching feature of the JuiceFS client can significantly reduce the load on the metadata service and object storage during model training, further enhancing the overall capacity of the storage system.
- The mirror file system provides enterprises with efficient cross-regional and cross-cloud data storage, sharing, and synchronization capabilities.
- JuiceFS Enterprise Edition provides a highly available, all-in-memory metadata service designed for high-performance file system requirements. It can be scaled horizontally online and easily manages tens of billions of files.
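
As a rough illustration of the unified interface, the sketch below reads the same object through a POSIX mount point and through an S3-compatible endpoint (for example, the JuiceFS S3 gateway). The mount path, endpoint address, bucket name, key, and credentials are placeholders, not values from this article.

```python
# Read the same JuiceFS file through two interfaces: the POSIX mount
# point and an S3-compatible endpoint (e.g. the JuiceFS S3 gateway).
# All paths, addresses, and credentials below are placeholders.
import boto3

POSIX_PATH = "/mnt/jfs/datasets/train/sample-000001.jpg"   # POSIX view
BUCKET, KEY = "myjfs", "datasets/train/sample-000001.jpg"  # S3 view

# 1. POSIX access: anything that works with local files works here.
with open(POSIX_PATH, "rb") as f:
    posix_bytes = f.read()

# 2. S3 access: point an ordinary S3 client at the S3-compatible endpoint.
s3 = boto3.client(
    "s3",
    endpoint_url="http://127.0.0.1:9000",  # placeholder endpoint
    aws_access_key_id="ACCESS_KEY",        # placeholder credentials
    aws_secret_access_key="SECRET_KEY",
)
s3_bytes = s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read()

# Both protocols see the same data, so pipeline stages can use whichever
# interface their framework expects without copying data between systems.
assert posix_bytes == s3_bytes
```

Because every stage of the pipeline sees the same namespace, no copy or migration step is needed between preprocessing, training, and serving.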
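For access control, a JuiceFS mount behaves like a local file system, so the usual Linux permission and ACL tooling applies. The access-control sketch below is assumption-laden: the subdirectory, the user name, and the premise that POSIX ACLs are enabled on the volume are all hypothetical.

```python
# Per-team isolation on a shared JuiceFS mount using standard POSIX
# semantics. The directory, user name, and ACL support are assumptions.
import os
import stat
import subprocess
from pathlib import Path

TEAM_DIR = Path("/mnt/jfs/projects/team-a")  # placeholder subdirectory

# Create a per-team subdirectory and restrict it with ordinary Unix modes.
TEAM_DIR.mkdir(parents=True, exist_ok=True)
os.chmod(TEAM_DIR, 0o750)  # owner: rwx, group: r-x, others: none

# Grant one extra collaborator read access with a POSIX ACL; standard
# Linux tooling (setfacl/getfacl) works because the mount is POSIX-compatible.
subprocess.run(["setfacl", "-m", "u:alice:rx", str(TEAM_DIR)], check=True)

# Inspect the resulting mode bits as on any local directory.
print(TEAM_DIR, stat.filemode(os.stat(TEAM_DIR).st_mode))
```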
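Finally, a minimal data-loading sketch for the read-heavy training case. It assumes an already-mounted volume with training images under a placeholder path; caching and prefetching happen inside the JuiceFS client, so the application only issues ordinary POSIX reads, while the DataLoader workers supply read concurrency.

```python
# Minimal PyTorch data-loading sketch over a JuiceFS POSIX mount.
# The mount path is a placeholder; caching and prefetching are handled by
# the JuiceFS client, so this code only performs ordinary file reads.
from pathlib import Path

import torch
from torch.utils.data import DataLoader, Dataset

DATA_DIR = Path("/mnt/jfs/datasets/train")  # placeholder mount path


class JfsImageDataset(Dataset):
    """Returns raw sample bytes read from the mounted file system."""

    def __init__(self, root: Path):
        self.files = sorted(root.glob("*.jpg"))

    def __len__(self) -> int:
        return len(self.files)

    def __getitem__(self, idx: int) -> torch.Tensor:
        # A plain POSIX read; once the first epoch warms the local cache,
        # repeated reads are served from cache instead of object storage.
        data = self.files[idx].read_bytes()
        return torch.frombuffer(bytearray(data), dtype=torch.uint8)


loader = DataLoader(
    JfsImageDataset(DATA_DIR),
    batch_size=None,    # yield individual samples; decoding is out of scope
    num_workers=8,      # parallel workers issue concurrent reads on the mount
    prefetch_factor=4,  # keep several samples in flight per worker
)

for sample in loader:
    pass  # hand `sample` to preprocessing / the training loop
```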