Runcrate

Overview

Runcrate is a GPU cloud platform that gives AI developers access to high-performance computing without the complexity of traditional cloud providers. The platform aggregates GPU capacity across multiple providers, offering H100, H200, A100, and RTX 4090 instances through a simple interface. You get a full development environment with VS Code, Jupyter notebooks, and SSH access built in. Deploy in 60 seconds, pay only for active hours, and skip the egress fees and hidden charges that plague larger platforms.

Key Features

Instant GPU Deployment: Spin up H100, H200, A100, or RTX 4090 instances in about 60 seconds with pre-configured environments and no setup required.
Built-In Development Tools: Access VS Code Server in your browser, Jupyter notebooks, SSH connectivity, and real-time GPU monitoring without installing anything.
Flexible GPU Options: Choose from H100 80GB ($1.54/hr), H200 ($2.41/hr), A100 80GB ($1.06/hr), or RTX 4090 ($0.52/hr) based on your workload needs.
Pay-Per-Hour Billing: Add credits via Stripe and pay only when instances are running, with no expiration on prepaid credits and no minimum commitments.
Custom Configuration: Configure CPU, memory, and storage for each instance, bring your own Docker images, and set environment variables.
Team Collaboration: Share workspaces with team members, collaborate on projects, and manage access with enterprise-grade security.
Reserved GPU Clusters: Request custom quotes for dedicated clusters with high-speed interconnect, specific GPU models, and regional preferences.
Zero Egress Fees: No charges for data transfer and no setup fees, with transparent pricing that stays consistent.
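
The listed hourly rates make rough cost estimates straightforward. The following is a minimal Python sketch, not a Runcrate API: the HOURLY_RATES table copies the prices quoted in the feature list, and estimate_cost is a hypothetical helper that assumes cost scales linearly with hours and GPU count.

```python
# Hourly rates (USD) as quoted in the feature list; actual prices may change.
HOURLY_RATES = {
    "H100 80GB": 1.54,
    "H200": 2.41,
    "A100 80GB": 1.06,
    "RTX 4090": 0.52,
}


def estimate_cost(gpu: str, hours: float, gpu_count: int = 1) -> float:
    """Estimate cost in USD, assuming linear pay-per-hour billing (an assumption)."""
    if hours < 0 or gpu_count < 1:
        raise ValueError("hours must be non-negative and gpu_count at least 1")
    return round(HOURLY_RATES[gpu] * hours * gpu_count, 2)


if __name__ == "__main__":
    # Compare what a 24-hour run would cost on each listed GPU type.
    for gpu in HOURLY_RATES:
        print(f"{gpu}: ${estimate_cost(gpu, hours=24):.2f} for 24 hours")
```

The estimate covers GPU time only; any charges for custom CPU, memory, or storage configurations are not modeled here.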

Pros

Significant Cost Savings: Up to 70% cheaper than AWS, GCP, and Azure for equivalent GPU instances.
Quick Setup: Deploy production-ready GPU instances in 60 seconds without infrastructure configuration.
Transparent Pricing: Pay-per-hour billing with no egress fees, no surprise charges, and credits that never expire.
Complete Development Environment: Browser IDE, Jupyter, and monitoring tools included without additional setup.
Flexible Scaling: Start and stop instances anytime to control costs, with billing only for active runtime.

Cons

Aggregated Infrastructure: Relies on aggregating GPU capacity from multiple providers rather than owning infrastructure, which may affect availability.
Technical Focus: The platform is primarily built for developers and engineers rather than non-technical users.
No Reserved Pricing by Default: Billing is based on a pay-as-you-go model, which may not suit teams that prefer fixed monthly costs or reserved pricing models.
Self-Managed Instances: Users must start, stop, and manage instances themselves, a process that may require some knowledge of cloud operations.

Use Cases

Training Large Language Models: Deploy H100 or A100 instances to train transformer models and large neural networks. The 80GB of high-bandwidth GPU memory accommodates large models and batch sizes, and the absence of egress fees removes a cost from moving data between training runs.
Computer Vision Development: Use RTX 4090 instances for GPU-accelerated rendering, 3D modeling, and real-time inference testing. The built-in Jupyter environment lets you prototype vision models and visualize results immediately in your browser.
Production Inference Servers: Launch A100 instances to serve real-time predictions for deployed models. Configure custom CPU and memory settings to optimize for your specific inference workload and scale instances up or down based on traffic.
Data Processing Pipelines: Run GPU-accelerated ETL workflows on large datasets using the pre-configured Python environment. Process terabytes of data with parallel GPU compute, then shut down instances to stop billing immediately.
Research Experimentation: Spin up different GPU types to benchmark model performance across hardware. Hourly billing lets you test multiple configurations without committing to long-term contracts or minimums.
Team ML Projects: Collaborate on machine learning projects with shared workspaces and team access controls. Multiple developers can work in the same environment with VS Code Server and Jupyter notebooks, sharing compute resources efficiently.

Frequently Asked Questions

How does Runcrate compare to AWS, GCP, or Azure?

Runcrate offers the same GPU models (H100, A100) at prices up to 70% lower. For example, its H100 costs $1.54/hour, compared to $5.12/hour for an AWS p5.2xlarge. You also avoid egress fees, and development tools (VS Code, Jupyter) are included rather than billed separately.
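
The "up to 70%" figure can be checked against the two hourly prices quoted in this answer. The sketch below is plain arithmetic in Python; savings_percent is a hypothetical helper name, and the comparison assumes the two instances are otherwise equivalent, which the quoted figures alone do not establish.

```python
def savings_percent(runcrate_rate: float, other_rate: float) -> float:
    """Percentage saved per GPU-hour relative to the other provider's rate."""
    if other_rate <= 0:
        raise ValueError("other_rate must be positive")
    return (1 - runcrate_rate / other_rate) * 100


# Hourly figures as quoted in this answer (not independently verified).
print(f"{savings_percent(1.54, 5.12):.1f}% lower")
```

With these two rates the result is just under 70%, which matches the headline claim for this particular pairing.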

What GPUs are available on Runcrate?

Runcrate offers NVIDIA H100 80GB, H200, A100 80GB, and RTX 4090 GPUs. All instances support custom CPU, memory, and storage configurations. You can also request custom quotes for reserved clusters with specific GPU models and high-speed interconnect.

How does billing work?

You add credits to your account via Stripe, then pay hourly only when instances are actively running. Credits never expire and there are no minimum commitments. Stop an instance anytime to pause billing. All pricing is transparent with no hidden fees or egress charges.
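
To make the billing model concrete, here is a small Python sketch that charges only for the time an instance was running and deducts it from a prepaid balance. The function names are hypothetical, and the linear proration of partial hours is an assumption: the billing granularity is not stated here.

```python
from datetime import datetime, timedelta


def billable_hours(sessions: list[tuple[datetime, datetime]]) -> float:
    """Sum the hours an instance was running across (start, stop) pairs."""
    total = timedelta()
    for start, stop in sessions:
        if stop < start:
            raise ValueError("stop must not precede start")
        total += stop - start
    return total.total_seconds() / 3600


def remaining_credit(balance: float, rate: float, sessions) -> float:
    """Credit left after charging running time at `rate` USD/hour.

    Assumes partial hours are prorated linearly (not confirmed by the source).
    """
    return round(balance - rate * billable_hours(sessions), 2)


if __name__ == "__main__":
    day = datetime(2025, 1, 6)
    sessions = [
        (day.replace(hour=9), day.replace(hour=10, minute=30)),  # 1.5 h running
        (day.replace(hour=14), day.replace(hour=16)),            # 2.0 h running
    ]
    # The stopped period between the two sessions is not billed.
    print(remaining_credit(balance=50.0, rate=1.06, sessions=sessions))
```

In this example, 3.5 running hours on an A100 80GB at the listed $1.06/hour are deducted from a $50 balance, while the hours between sessions cost nothing.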

Can I use my own development environment?

Yes. Runcrate provides VS Code Server and Jupyter notebooks in the browser, but you also get full SSH access and root privileges. You can bring your own Docker images, configure environment variables, and install any tools you need.
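
One common way to use instance-level environment variables is to read job settings from them inside your own script. The Python sketch below illustrates that pattern only; the variable names MODEL_NAME, BATCH_SIZE, and DATA_DIR and the default path are hypothetical, not names that Runcrate defines.

```python
import os
from dataclasses import dataclass


@dataclass
class JobConfig:
    model_name: str
    batch_size: int
    data_dir: str


def load_config(env=os.environ) -> JobConfig:
    """Build a job config from environment variables, with fallback defaults.

    MODEL_NAME, BATCH_SIZE, and DATA_DIR are hypothetical names for illustration.
    """
    return JobConfig(
        model_name=env.get("MODEL_NAME", "baseline"),
        batch_size=int(env.get("BATCH_SIZE", "32")),
        data_dir=env.get("DATA_DIR", "/workspace/data"),
    )


if __name__ == "__main__":
    print(load_config())
```

Keeping settings in environment variables means the same Docker image can run different jobs without being rebuilt.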

How quickly can I deploy a GPU instance?

Deployment takes about 60 seconds. Select your GPU type and configuration, then launch. The environment comes pre-configured with common ML frameworks and tools, so you can start working immediately without setup time.
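
A reasonable first step after launch is to confirm that the GPU you selected is visible from inside the instance. The Python sketch below shells out to nvidia-smi, the standard NVIDIA driver utility; it assumes the instance image ships that tool, and the helper names are hypothetical.

```python
import shutil
import subprocess


def parse_gpu_csv(text: str) -> list[tuple[str, str]]:
    """Parse 'name, memory.total' CSV lines into (name, memory) pairs."""
    gpus = []
    for line in text.strip().splitlines():
        name, _, memory = line.rpartition(",")
        gpus.append((name.strip(), memory.strip()))
    return gpus


def list_gpus() -> list[tuple[str, str]]:
    """Return visible NVIDIA GPUs, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    for name, memory in list_gpus():
        print(f"{name}: {memory}")
```

An empty result means either that no NVIDIA GPU is visible or that nvidia-smi is not on the PATH, so it is worth checking both before starting a long job.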

Is Runcrate suitable for production workloads?

Runcrate supports production inference servers and AI applications with enterprise security features and team collaboration tools. However, because Runcrate is a newer platform that aggregates GPU capacity from multiple providers, you should evaluate its availability and SLA terms against the requirements of your specific use case.

What AI frameworks and models are supported?

Runcrate supports all major frameworks since you have full control over the environment. Common use cases include running LLaMA, Stable Diffusion, and custom ML workloads. You can bring your own Docker images with any framework pre-installed.

Can I reserve GPUs for long-term projects?

Yes. While the standard offering is pay-as-you-go, Runcrate provides custom quotes within 24 hours for reserved GPU clusters. You can specify the number of GPUs, model type, region, and interconnect requirements for dedicated capacity.