Massive-scale data science,
from the comfort of your laptop

# Launch an Arkouda server: ./arkouda_server -nl <number-of-locales>

import arkouda as ak

# connect to the server
ak.connect('localhost', 5555)

# Generate two large arrays
a = ak.random.randint(0,2**32,2**38) # ----> Won't fit on a single machine!
b = ak.random.randint(0,2**32,2**38) #       About 2 TiB of 64-bit integers per array.

# add them
c = a + b

# Sort the array and print first 10 elements
c = ak.sort(c)
print(c[0:10])


import numpy as np




# Generate two large arrays
a = np.random.randint(0,2**32,2**28) # ----> smaller to fit on a single machine
b = np.random.randint(0,2**32,2**28)

# add them
c = a + b

# Sort the array and print first 10 elements
c = np.sort(c)
print(c[0:10])
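
For a sense of scale, the footprint of each array can be estimated with simple arithmetic. This is a minimal sketch that assumes every element is stored as an 8-byte (64-bit) integer; `array_bytes` and `human_readable` are hypothetical helper names used only for illustration.

```python
# Back-of-the-envelope memory footprint of the arrays in the two snippets.
# Assumption: every element is an 8-byte (64-bit) integer.
BYTES_PER_ELEMENT = 8

def array_bytes(num_elements, itemsize=BYTES_PER_ELEMENT):
    """Raw storage needed for one array, ignoring any container overhead."""
    return num_elements * itemsize

def human_readable(nbytes):
    """Format a byte count using binary units (1 KiB = 1024 B)."""
    value = float(nbytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if value < 1024 or unit == "TiB":
            return f"{value:g} {unit}"
        value /= 1024

print(human_readable(array_bytes(2**28)))  # NumPy example   → 2 GiB
print(human_readable(array_bytes(2**38)))  # Arkouda example → 2 TiB
```

Each snippet keeps a, b, and the result c alive at the same time, so the peak footprint is at least about three times the per-array figure.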

Arkouda v2024.10.02 released!

The new release includes faster Parquet I/O, initial support for sparse matrices, improvements to the random module that bring it closer to NumPy, and many bug fixes.

Read the release notes →

Arkouda is…

Fast

Arkouda is powered by Chapel, a programming language built from the ground up to support parallelism and distributed computing. Make the most out of every core and every node in your system.

Interactive

By distributing your data across multiple nodes, Arkouda lets you transform and wrangle, in real time, datasets that are simply intractable on a laptop or desktop.

Extensible

Arkouda’s capabilities can be extended, so arbitrary scalable computations can be run from Python.

Powered by Chapel

Arkouda’s backend is implemented in Chapel, an open-source parallel programming language. Chapel is unique among mainstream languages in that it puts parallelism and locality at the forefront without sacrificing productivity or portability. Chapel enables Arkouda to perform well and scale on many different architectures, from multicore laptops to cloud systems to the world’s fastest supercomputers.

To learn more about Chapel, check out its blog, presentations, tutorials and demos, and the How Can I Learn Chapel? page.

Arkouda users are saying…

…solving problems in a matter of seconds, as opposed to days…

[I’m] working with more data than I ever thought possible as a data scientist!

With Arkouda, you can…

Make the most of your hardware

Source: Arkouda argsort Benchmark

Hardware: HPE Cray EX with a Slingshot-11 network (200 Gb/s)

Do Exploratory Data Analysis (EDA) on large to massive datasets

No other data analysis tool can sort or group massive datasets as effectively as Arkouda. For example, Arkouda has been shown to sort data at 8+ TB/s on 8k compute nodes. Whenever your dataset exceeds what fits on a single node, you are likely to benefit from Arkouda.
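
To put the sort figure in perspective, the arithmetic below derives a per-node rate and an idealized sort time from it. It is a rough sketch under stated assumptions: "8k" is read as 8192 nodes, 1 TB is taken as 10^12 bytes, the quoted 8 TB/s is treated as a sustained aggregate rate, and the example array is 2**38 eight-byte integers. The function names are hypothetical.

```python
TB = 10**12  # assumption: decimal terabytes

def per_node_rate(aggregate_bytes_per_s, nodes):
    """Average sort throughput contributed by each node, in bytes/s."""
    return aggregate_bytes_per_s / nodes

def ideal_sort_seconds(data_bytes, aggregate_bytes_per_s):
    """Time to sort data_bytes if the aggregate rate were fully sustained."""
    return data_bytes / aggregate_bytes_per_s

aggregate = 8 * TB        # "8+ TB/s", taken at its lower bound
nodes = 8192              # assumption: "8k" means 8192 compute nodes
data = 2**38 * 8          # 2**38 elements of 8 bytes each

print(f"{per_node_rate(aggregate, nodes) / 1e9:.2f} GB/s per node")   # → 0.98 GB/s per node
print(f"{ideal_sort_seconds(data, aggregate):.2f} s for the array")   # → 0.27 s for the array
```

These are idealized numbers; actual timings depend on data distribution, network, and problem size.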

Write familiar Python code

Arkouda’s library functions deliberately mirror those of NumPy and pandas, so you can get started with a minimal learning curve.

  # Generate two large arrays
- a = np.random.randint(0,2**32,2**28)
- b = np.random.randint(0,2**32,2**28)
+ a = ak.random.randint(0,2**32,2**38)
+ b = ak.random.randint(0,2**32,2**38)

  # add them
  c = a + b

  # Sort the array and print first 10 elements
- c = np.sort(c)
+ c = ak.sort(c)
  print(c[0:10])
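
Because the calls line up this closely, the same pipeline can be written once and handed either module. The sketch below is illustrative only: `smallest_sums` and its `xp` parameter are hypothetical names, and it is exercised here with NumPy alone. Passing `ak` instead would presumably work the same way, since it uses only the calls shown on this page, but that requires a running Arkouda server and a prior `ak.connect(...)`.

```python
import numpy as np

def smallest_sums(xp, n, k=10):
    """Add two random integer arrays of length n and return the k smallest sums.

    xp is the array module to use: numpy here, or arkouda once connected
    to a server (assumption: not exercised in this sketch).
    """
    a = xp.random.randint(0, 2**32, n)
    b = xp.random.randint(0, 2**32, n)
    c = xp.sort(a + b)
    return c[0:k]

# Small enough to run comfortably on a laptop.
print(smallest_sums(np, 2**20))
```

The only thing that changes between a laptop run and a cluster run is the module passed in and the problem size.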