All Features

Everything Cleora Can Do

A comprehensive overview of every capability packed into a single 5 MB package — no GPU, no heavy dependencies.

Embedding Engine

8 algorithms unified under one API — spectral, walk-based, and matrix factorization methods

Cleora
ProNE
RandNE
DeepWalk
Node2Vec
HOPE
NetMF
GraRep
Variants: multiscale · attention · directed · weighted · streaming · inductive · edge features · node features
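The spectral iteration at the heart of the Cleora family can be sketched in a few lines of NumPy/SciPy. This is an illustrative toy, not the library's Rust implementation: propagate node vectors over a row-normalized (random-walk) adjacency matrix, then L2-normalize after each pass.

```python
import numpy as np
from scipy import sparse

def cleora_style_embed(edges, n_nodes, dim=8, iterations=3, seed=0):
    """Illustrative Cleora-style spectral embedding: repeated
    random-walk propagation followed by L2 row normalization."""
    rows, cols = zip(*edges)
    A = sparse.coo_matrix((np.ones(len(edges)), (rows, cols)),
                          shape=(n_nodes, n_nodes))
    A = ((A + A.T) > 0).astype(float).tocsr()   # symmetric adjacency
    deg = np.asarray(A.sum(axis=1)).ravel()
    deg[deg == 0] = 1.0
    M = sparse.diags(1.0 / deg) @ A             # row-stochastic walk matrix
    rng = np.random.default_rng(seed)           # seeded -> deterministic
    E = rng.standard_normal((n_nodes, dim))
    for _ in range(iterations):
        E = M @ E                               # average neighbors' vectors
        norms = np.linalg.norm(E, axis=1, keepdims=True)
        norms[norms == 0] = 1.0
        E = E / norms                           # keep rows on the unit sphere
    return E

emb = cleora_style_embed([(0, 1), (1, 2), (2, 3), (3, 0)], n_nodes=4)
```

The same propagate-and-normalize loop underlies the directed, weighted, and streaming variants; they differ mainly in how the transition matrix is built.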

Rust Performance Core

Sparse matrix operations in Rust with PyO3 bindings. Adaptive parallelism across all CPU cores. 240x faster than GraphSAGE on large graphs.

240x faster than GraphSAGE
5 MB total footprint
0 heavy dependencies

Heterogeneous & Hypergraphs

Multi-type nodes and edges with per-relation embeddings. HeteroGraph class, metapath-based embedding, and homogeneous export for real-world data that doesn't fit simple graphs.

HeteroGraph
Multi-type nodes
Multi-type edges
Metapath embedding
Homogeneous export

Classification

Node classification without PyTorch or TensorFlow. Pure NumPy/SciPy implementations that run anywhere.

MLP Classifier
Label Propagation
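Label propagation is simple enough to sketch in full. The following is a minimal NumPy version of the classic algorithm (not Cleora's code): diffuse label distributions along edges and clamp the known labels each round.

```python
import numpy as np

def label_propagation(adj, labels, n_iter=20):
    """Semi-supervised label propagation: repeatedly average neighbors'
    label distributions, clamping known labels. -1 marks unknown nodes."""
    n = adj.shape[0]
    classes = np.unique(labels[labels >= 0])
    Y = np.zeros((n, len(classes)))
    known = labels >= 0
    Y[known, np.searchsorted(classes, labels[known])] = 1.0
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0
    P = adj / deg                    # row-normalized transition matrix
    F = Y.copy()
    for _ in range(n_iter):
        F = P @ F                    # diffuse labels to neighbors
        F[known] = Y[known]          # clamp labeled nodes
    return classes[F.argmax(axis=1)]

# path graph 0-1-2-3 with node 0 labeled 0 and node 3 labeled 1
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
pred = label_propagation(adj, np.array([0, -1, -1, 1]))
```

Each unlabeled node ends up with the label of the nearest seed, which is exactly the behavior you want from a cheap, model-free baseline.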

Community Detection

Discover clusters and communities in your graph using multiple algorithms with modularity scoring.

Louvain
K-means clustering
Spectral clustering
Modularity scoring
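Modularity is the score all of these clustering methods are judged by. A dense-matrix sketch of Newman's formula (illustrative only; a real implementation would stay sparse):

```python
import numpy as np

def modularity(adj, communities):
    """Newman modularity:
    Q = (1/2m) * sum_ij (A_ij - k_i * k_j / 2m) * delta(c_i, c_j)."""
    k = adj.sum(axis=1)
    two_m = k.sum()
    same = np.equal.outer(communities, communities)
    return ((adj - np.outer(k, k) / two_m) * same).sum() / two_m

# two triangles joined by a single bridge edge: a clear 2-community split
adj = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0
q = modularity(adj, np.array([0, 0, 0, 1, 1, 1]))
```

Positive Q means more intra-community edges than a degree-matched random graph would produce; the split above scores 5/14, about 0.36.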

Graph Statistics

Compute structural properties and centrality measures directly from your graph.

PageRank
Betweenness centrality
Clustering coefficient
Degree distribution
Diameter
Connected components
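As an example of how these measures are computed, here is PageRank by power iteration in plain NumPy (a sketch of the standard algorithm, not the library's code):

```python
import numpy as np

def pagerank(adj, damping=0.85, n_iter=100):
    """PageRank by power iteration on the transition matrix."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=1, keepdims=True)
    out_deg[out_deg == 0] = 1.0            # guard against dangling nodes
    P = (adj / out_deg).T                  # P[j, i] = prob of stepping i -> j
    r = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        r = (1 - damping) / n + damping * (P @ r)
    return r / r.sum()

# 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
adj = np.array([[0, 1, 1],
                [0, 0, 1],
                [1, 0, 0]], dtype=float)
ranks = pagerank(adj)
```

Node 2 collects rank from both 0 and 1, so it ends up with the highest score.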

Preprocessing

Clean and prepare your graph data before embedding.

clean_graph
filter_by_degree
largest_connected_component
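The largest-connected-component step, for instance, is a one-liner on top of SciPy's `csgraph` module. A minimal sketch (the helper name and return shape here are illustrative, not necessarily Cleora's signature):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.csgraph import connected_components

def largest_cc(adj):
    """Keep only the largest connected component of an undirected graph."""
    _, labels = connected_components(adj, directed=False)
    idx = np.flatnonzero(labels == np.bincount(labels).argmax())
    return adj[idx][:, idx], idx           # induced subgraph + kept node ids

# 5 nodes: {0,1,2} form a path, {3,4} an isolated edge
adj = sparse.csr_matrix(np.array([[0, 1, 0, 0, 0],
                                  [1, 0, 1, 0, 0],
                                  [0, 1, 0, 0, 0],
                                  [0, 0, 0, 0, 1],
                                  [0, 0, 0, 1, 0]], dtype=float))
sub, kept = largest_cc(adj)
```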

Similarity Search

Find similar entities using brute-force or approximate nearest neighbors. Predict missing links in your graph.

Brute-force KNN
ANNIndex (Faiss)
Link prediction
Cosine similarity
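Brute-force cosine KNN is just a matrix multiply once the embeddings are unit-normalized. A self-contained sketch of the technique (ANN backends like Faiss only speed up the same search):

```python
import numpy as np

def cosine_knn(embeddings, query_idx, k=2):
    """Brute-force k-nearest-neighbor search under cosine similarity."""
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = X @ X[query_idx]            # cosine similarity to every node
    sims[query_idx] = -np.inf          # exclude the query itself
    top = np.argsort(-sims)[:k]
    return top, sims[top]

emb = np.array([[1.0, 0.0],
                [0.9, 0.1],
                [0.0, 1.0],
                [0.1, 0.9]])
idx, scores = cosine_knn(emb, query_idx=0, k=2)
```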

Embedding Compression

Reduce embedding dimensionality and memory footprint without losing signal.

PCA
Random projection
Product quantization
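Random projection is the cheapest of the three: multiply by a fixed Gaussian matrix and, by the Johnson-Lindenstrauss lemma, pairwise geometry is approximately preserved. A minimal sketch:

```python
import numpy as np

def random_projection(X, out_dim, seed=0):
    """Compress embeddings with a Gaussian random projection;
    scaling by 1/sqrt(out_dim) keeps expected norms unchanged."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((X.shape[1], out_dim)) / np.sqrt(out_dim)
    return X @ R

X = np.random.default_rng(1).standard_normal((100, 256))
Z = random_projection(X, out_dim=64)   # 256 -> 64 dims, 4x smaller
```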

Alignment & Ensemble

Align embedding spaces and combine multiple embeddings for stronger representations.

Procrustes alignment
CCA alignment
Combine (mean / max / concat / weighted)
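Orthogonal Procrustes alignment has a closed-form solution via one SVD: find the rotation that maps one embedding space onto another. A sketch of that computation (illustrative, not Cleora's function signature):

```python
import numpy as np

def procrustes_align(source, target):
    """Orthogonal Procrustes: the rotation Q minimizing
    ||source @ Q - target||_F is U @ Vt, where source.T @ target = U S Vt."""
    U, _, Vt = np.linalg.svd(source.T @ target)
    return source @ (U @ Vt)

rng = np.random.default_rng(0)
target = rng.standard_normal((50, 8))
R_true, _ = np.linalg.qr(rng.standard_normal((8, 8)))  # a random rotation
source = target @ R_true        # `source` is `target` in rotated coordinates
aligned = procrustes_align(source, target)
```

Because the two spaces here differ by an exact rotation, the alignment recovers `target` to numerical precision; aligned spaces can then be combined by mean, max, concat, or weighted averaging.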

Evaluation & Metrics

Comprehensive evaluation without leaving the library. Measure embedding quality across multiple tasks.

AUC
MRR
Hits@K
MAP@K
nDCG
Cross-validation
Node classification scores
Clustering scores (ARI, Silhouette)
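The ranking metrics reduce to one idea: rank all candidates by score and locate the true positive. A compact sketch of MRR and Hits@K for link prediction (standard definitions, not the library's code):

```python
import numpy as np

def ranking_metrics(scores, positives, k=3):
    """MRR and Hits@K: per query, the rank of the true positive
    among all scored candidates (rank 1 = best)."""
    ranks = []
    for row, pos in zip(scores, positives):
        order = np.argsort(-row)                       # best first
        ranks.append(int(np.where(order == pos)[0][0]) + 1)
    ranks = np.array(ranks)
    return {"MRR": float((1.0 / ranks).mean()),
            "Hits@%d" % k: float((ranks <= k).mean())}

scores = np.array([[0.9, 0.1, 0.3, 0.2],   # positive 0 ranked 1st
                   [0.2, 0.8, 0.9, 0.1],   # positive 1 ranked 2nd
                   [0.5, 0.6, 0.1, 0.7]])  # positive 2 ranked 4th
m = ranking_metrics(scores, positives=[0, 1, 2], k=3)
```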

Graph Sampling

Sample subgraphs for scalable training and evaluation on large graphs.

Neighborhood sampling
Subgraph sampling
GraphSAINT
Negative sampling
Train / test split
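Edge splitting and negative sampling go hand in hand for link-prediction evaluation: hold out some true edges, then draw an equal number of node pairs that are not edges. A self-contained sketch (the helper name is illustrative):

```python
import numpy as np

def split_edges(edges, test_frac=0.25, seed=0):
    """Random train/test split over edges, plus uniformly sampled
    negative pairs (non-edges) matching the test-set size."""
    rng = np.random.default_rng(seed)
    edges = np.asarray(edges)
    perm = rng.permutation(len(edges))
    n_test = int(len(edges) * test_frac)
    test, train = edges[perm[:n_test]], edges[perm[n_test:]]
    existing = {tuple(sorted(e)) for e in edges}
    n_nodes = edges.max() + 1
    negatives = []
    while len(negatives) < n_test:            # rejection-sample non-edges
        u, v = rng.integers(0, n_nodes, size=2)
        if u != v and tuple(sorted((u, v))) not in existing:
            negatives.append((u, v))
    return train, test, np.array(negatives)

edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 2), (1, 3)]
train, test, neg = split_edges(edges)
```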

I/O & Interop

Import from and export to popular data science formats. Seamless integration with your existing stack.

Pandas
NumPy
SciPy
Edge list
NetworkX
Export to PyG
Export to DGL
Save / load embeddings

Visualization

Reduce dimensions and plot embeddings for exploration and debugging.

reduce_dimensions (PCA, t-SNE, UMAP)
plot_embeddings

Datasets

14+ built-in datasets for benchmarking and experimentation — ready to use with a single call.

Facebook
LiveJournal
Cora
CiteSeer
PPI
Amazon
Reddit
roadNet
+ more

Synthetic Generators

Generate synthetic graphs with known properties for testing and experimentation.

Erdős–Rényi
Barabási–Albert
Stochastic Block Model
Planted partition
Watts–Strogatz
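The Erdős–Rényi model is the simplest of these generators and fits in a few lines; a seeded NumPy sketch of G(n, p) (illustrative, not the library's generator):

```python
import numpy as np

def erdos_renyi(n, p, seed=0):
    """G(n, p): each of the n*(n-1)/2 possible undirected edges
    appears independently with probability p."""
    rng = np.random.default_rng(seed)
    coin_flips = rng.random((n, n)) < p
    adj = np.triu(coin_flips, k=1)        # strict upper triangle: no self-loops
    return (adj | adj.T).astype(float)    # symmetrize

A = erdos_renyi(100, p=0.1, seed=42)      # ~495 edges in expectation
```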

Hyperparameter Tuning

Find the optimal embedding configuration with automatic evaluation across all algorithms.

Grid search
Random search

Benchmarking

Compare algorithms across datasets with time, memory, and accuracy metrics. Publication-ready results.

benchmark_algorithms
benchmark_datasets

CLI

Embed graphs directly from the command line — perfect for scripting and CI/CD pipelines.

pycleora embed
pycleora info
pycleora benchmark
pycleora similar

Scikit-learn Compatible

CleoraEmbedder implements the scikit-learn estimator API — fit(), transform(), fit_transform(). Works with sklearn pipelines and grid search.

Deterministic & Reproducible

Seeded initialization guarantees identical results on every run, across platforms — critical for research and production.

Tiny Footprint

Just 5 MB installed — only numpy and scipy required. Compare to 500 MB+ for PyTorch Geometric or DGL. Installs in seconds, not minutes.

5 MB vs 500 MB+
0 heavy dependencies
Installs in seconds

Supervised Fine-tuning

Refine embeddings with labeled positive/negative pairs using margin loss. Adapt pre-trained embeddings to your specific task without retraining from scratch.
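The mechanics of margin-loss fine-tuning can be sketched with plain gradient steps on dot-product scores. This toy (illustrative only; pair format, learning rate, and update rule are assumptions, not Cleora's API) pushes each positive pair's score above each negative pair's score by at least the margin:

```python
import numpy as np

def margin_finetune(emb, pos_pairs, neg_pairs, margin=0.5, lr=0.1, epochs=50):
    """Hinge/margin fine-tuning on dot-product scores: update only
    while  margin - score(pos) + score(neg) > 0  is violated."""
    E = emb.copy()
    for _ in range(epochs):
        for (i, j), (a, b) in zip(pos_pairs, neg_pairs):
            if margin - E[i] @ E[j] + E[a] @ E[b] > 0:   # margin violated
                di, dj = E[j].copy(), E[i].copy()
                da, db = E[b].copy(), E[a].copy()
                E[i] += lr * di; E[j] += lr * dj         # raise positive score
                E[a] -= lr * da; E[b] -= lr * db         # lower negative score
    return E

rng = np.random.default_rng(0)
emb = 0.1 * rng.standard_normal((4, 8))                  # pretend pre-trained
tuned = margin_finetune(emb, pos_pairs=[(0, 1)], neg_pairs=[(2, 3)])
```

Because updates stop as soon as the margin is satisfied, the pre-trained geometry is disturbed as little as possible, which is the point of fine-tuning rather than retraining from scratch.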