200x faster than DeepWalk,
4x-8x faster than PyTorch-BigGraph by Facebook

Cleora computes embeddings of your relational data. Entities such as clients, products, stores, and accounts can be represented with embeddings, just as Word2Vec or BERT represent text and CLIP represents images. Cleora embeddings are behavioral: they represent entities by their behavior history, which takes the form of large graphs.
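
To make the idea of a behavioral embedding concrete, here is a minimal sketch in plain Python. It is not Cleora's API or its actual algorithm; it only illustrates the general idea that entities with similar interaction histories end up with similar vectors. The event log, the function names (`build_graph`, `embed`, `cosine`), and the neighbor-averaging scheme are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict

# Hypothetical behavior history: (client, product) purchase events.
EVENTS = [
    ("alice", "espresso"), ("alice", "croissant"),
    ("bob", "espresso"), ("bob", "croissant"),
    ("carol", "hammer"), ("carol", "nails"),
    ("dave", "hammer"), ("dave", "nails"),
]


def build_graph(events):
    """Turn the event log into an undirected client-product graph."""
    neighbors = defaultdict(set)
    for client, product in events:
        neighbors[client].add(product)
        neighbors[product].add(client)
    return neighbors


def normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def embed(neighbors, dim=8, iterations=2, seed=0):
    """Illustrative scheme: start from random unit vectors, then repeatedly
    replace each node's vector with the normalized mean of its neighbors."""
    rng = random.Random(seed)
    emb = {
        node: normalize([rng.uniform(-1.0, 1.0) for _ in range(dim)])
        for node in sorted(neighbors)
    }
    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbors.items():
            ordered = sorted(nbrs)  # fixed order keeps the sums reproducible
            mean = [
                sum(emb[n][i] for n in ordered) / len(ordered)
                for i in range(dim)
            ]
            updated[node] = normalize(mean)
        emb = updated
    return emb


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


graph = build_graph(EVENTS)
vectors = embed(graph)
print(cosine(vectors["alice"], vectors["bob"]))    # identical histories, so close to 1.0
print(cosine(vectors["alice"], vectors["carol"]))  # unrelated histories
```

In this toy log, `alice` and `bob` bought the same products, so their vectors coincide, while `carol` shopped for unrelated items and her vector is derived from a different part of the graph.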

What can you build with Cleora Embeddings?

Self-service Cleora 2.0 is now available for everyone

Cleora Open Source is publicly available on GitHub and used by many industry leaders.

Key improvements in Cleora 2.0 over the open source 1.0 version:

automatic scaling: no expensive hardware required

ease of use: only 3 columns extracted from your DB are required. Graphs are detected automatically in the data

performance optimizations: 10x faster embedding times

latest research: significantly improved embedding quality

new feature: item attributes are supported

We used Cleora for customer-restaurant graph data in the National Capital Region (NCR). To our delight, the embedding generation was superfast (i.e., under 5 minutes). For context, GraphSAGE took ~20 hours for the same NCR data.

Science

Lab

Sair is a lab focused on behavioral modeling, recommendations, and large-scale data and graph processing. We share our ideas, models, and experimental results, and present our take on important breakthroughs and interesting technologies. We hope to build a better and more thorough understanding of the field. We believe this research matters not only from a business perspective but, most importantly, as a study of human decision-making processes.

Research

8 min read

BaseModel vs TIGER for sequential recommendations

The comparison between BaseModel and TIGER reveals substantial differences in their architectural choices and performance.

Research

8 min read

BaseModel vs HSTU for sequential recommendations

To evaluate BaseModel against HSTU, we replicated the exact data preparation, training, validation, and testing protocols described in the HSTU paper.

Research

8 min read

Fourier Feature Encoding of numerical features

Pre-processing raw input data is a very important part of any machine learning pipeline and is often crucial for final model performance.

Future

6 min read

Why We Need Inhuman Artificial Intelligence

We continuously wonder how much longer it will take until AI reaches human skill level in these tasks - or, when does AI become "truly" intelligent.

Engineering

12 min read

EMDE vs Multiresolution Hash Encoding

When we created our EMDE algorithm, we primarily had the domain of behavioral profiling in mind.

Tools

8 min read

Efficient integer pair hashing

Mental models are simple expressions of complex processes or relationships.

Research

9 min read

Cleora: how we handle billion-scale graph data

We have recently open-sourced Cleora, an ultra-fast vertex embedding tool for graphs and hypergraphs.

Research

8 min read

Towards a multi-purpose behavioral model

In various subfields of AI research, there is a tendency to create models which can serve many different tasks with minimal fine-tuning effort.

Research

10 min read

EMDE Illustrated

In this article we provide some intuitive explanations of the objectives and theoretical background of the Efficient Manifold Density Estimator (EMDE).

Research

7 min read

How we challenge the Transformer

Having achieved remarkable successes in natural language and image processing, Transformers have finally found their way into the area of recommendation.

Create Embeddings
with 1 Click

A machine learning tool that makes producing graph embeddings for big graphs faster and much easier

Embedding quality

Embedding speed

Key technical features of Cleora embeddings

Efficiency

Inductivity

Cross-dataset compositionality

Stability

Extreme parallelism and performance

Dim-wise independence
