If you’re new to hyperbolic geometry, we recommend that you read our Simple Geometry Initiation.

This blogpost presents our work Hyperbolic Neural Networks (arxiv paper, code, poster, video), accepted to NIPS’18 with a spotlight presentation.

The aim of this work is to make deep learning feasible in hyperbolic space, more specifically in the Poincaré ball.

In particular, we’ll explain how to:

- Perform matrix-vector multiplications, apply non-linearities
- Adapt certain neural network architectures (basic RNN, GRU, …)
- Adapt softmax/multinomial logistic regression

### Starter: basic definitions and Möbius operations

Remember that in the Poincaré ball model of hyperbolic space, the set we consider is the open ball

$$\mathbb{D}^n = \{x \in \mathbb{R}^n : \|x\| < 1\},$$

and we measure distances between two points $x, y \in \mathbb{D}^n$ via

$$d(x, y) = \cosh^{-1}\!\left(1 + \tfrac{1}{2}\lambda_x \lambda_y \|x - y\|^2\right),$$

where $\cosh$ is the even part of $\exp$ and $\lambda_x := \frac{2}{1 - \|x\|^2}$ is called the *conformal factor.*
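Here is a minimal NumPy sketch of this distance (the function names are ours, for illustration):

```python
import numpy as np

def conformal_factor(x):
    """lambda_x = 2 / (1 - ||x||^2): how much the metric is stretched at x."""
    return 2.0 / (1.0 - np.dot(x, x))

def poincare_distance(x, y):
    """Geodesic distance between x and y in the Poincare ball (curvature -1)."""
    sq_diff = np.dot(x - y, x - y)
    return np.arccosh(1.0 + 0.5 * conformal_factor(x) * conformal_factor(y) * sq_diff)
```

For a point $x$ with $\|x\| = r$, this agrees with the closed form $d(0, x) = 2\tanh^{-1}(r)$, and distances blow up as points approach the boundary of the ball.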

With this definition in hand, you may wonder: *can we define basic operations, such as adding two elements or rescaling one by a real number?*

These have indeed already been defined, and are called *Möbius operations.*

The Möbius addition of $y$ to $x$ is defined by

$$x \oplus y := \frac{(1 + 2\langle x, y\rangle + \|y\|^2)\,x + (1 - \|x\|^2)\,y}{1 + 2\langle x, y\rangle + \|x\|^2\|y\|^2}.$$
Although this operation is neither commutative nor associative, it satisfies some basic desirable properties, such as $x \oplus 0 = 0 \oplus x = x$ and the left-cancellation law $(-x) \oplus (x \oplus y) = y$.

The Möbius scalar multiplication of $x$ by $r \in \mathbb{R}$ is defined by

$$r \otimes x := \tanh\!\big(r \tanh^{-1}(\|x\|)\big)\,\frac{x}{\|x\|} \quad \text{for } x \neq 0, \qquad r \otimes 0 := 0.$$

Likewise, this operation satisfies a few natural properties: $n \otimes x = x \oplus \cdots \oplus x$ ($n$ additions), associativity $(rs) \otimes x = r \otimes (s \otimes x)$, and distributivity $(r + s) \otimes x = (r \otimes x) \oplus (s \otimes x)$.

Another reason why these definitions make sense is that they provide simple formulas for geodesics and distances: for instance, $d(x, y) = 2\tanh^{-1}\!\big(\|(-x) \oplus y\|\big)$.
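Both operations are short to implement; the following NumPy sketch (function names ours, for illustration) follows the two definitions above:

```python
import numpy as np

def mobius_add(x, y):
    """Mobius addition x (+) y on the unit Poincare ball."""
    xy, xx, yy = np.dot(x, y), np.dot(x, x), np.dot(y, y)
    num = (1.0 + 2.0 * xy + yy) * x + (1.0 - xx) * y
    return num / (1.0 + 2.0 * xy + xx * yy)

def mobius_scalar(r, x):
    """Mobius scalar multiplication r (x) x."""
    n = np.linalg.norm(x)
    if n == 0.0:
        return x
    return np.tanh(r * np.arctanh(n)) * x / n
```

One can check numerically that left cancellation $(-x) \oplus (x \oplus y) = y$ holds, and that $2 \otimes x = x \oplus x$.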

### Matrix-vector multiplication in hyperbolic space

Our first contribution starts with the observation that we can rewrite the mysterious formula of Möbius scalar multiplication in a more intuitive manner, using two well-known tools from differential geometry: the exponential map $\exp_0$ and the logarithmic map $\log_0$, taken at the origin of the Poincaré ball.

We get the following formula:

$$r \otimes x = \exp_0\!\big(r \log_0(x)\big).$$

This means that the Möbius scalar multiplication of $x$ by $r$ corresponds to:

- Moving $x$ to the tangent space at $0$ of the Poincaré ball using $\log_0$,
- Multiplying this vector by $r$, since $\log_0(x)$ now lives in the vector space $T_0\mathbb{D}^n \cong \mathbb{R}^n$,
- Projecting it back on the manifold using the exp map at 0.

This identity can be easily verified by replacing the exp and log maps by their closed-form expressions, namely $\exp_0(v) = \tanh(\|v\|)\,\frac{v}{\|v\|}$ and $\log_0(y) = \tanh^{-1}(\|y\|)\,\frac{y}{\|y\|}$.

We propose to define matrix-vector multiplication in the Poincaré ball in a similar manner:

$$M \otimes x := \exp_0\!\big(M \log_0(x)\big).$$

It is then easily checked that this operation still satisfies the usual desirable properties, such as $(M M') \otimes x = M \otimes (M' \otimes x)$ and $I \otimes x = x$.
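A minimal NumPy sketch of this definition (function names ours, for illustration):

```python
import numpy as np

def exp0(v):
    """Exponential map at the origin of the Poincare ball."""
    n = np.linalg.norm(v)
    return v if n == 0.0 else np.tanh(n) * v / n

def log0(y):
    """Logarithmic map at the origin (inverse of exp0)."""
    n = np.linalg.norm(y)
    return y if n == 0.0 else np.arctanh(n) * y / n

def mobius_matvec(M, x):
    """Mobius matrix-vector multiplication: M (x) x = exp0(M log0(x))."""
    return exp0(M @ log0(x))
```

Associativity $(MM') \otimes x = M \otimes (M' \otimes x)$ follows immediately, since $\log_0 \circ \exp_0$ is the identity on the tangent space.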

### Applying a non-linearity

The previous definitions naturally suggest applying non-linearities in the same manner:

- Moving $x$ to the tangent space at $0$ of the Poincaré ball using $\log_0$,
- Applying the non-linearity $\varphi$ to $\log_0(x)$, since it is now in the vector space $T_0\mathbb{D}^n \cong \mathbb{R}^n$,
- Projecting it back on the manifold using the exp map at 0.

We get

$$\varphi^{\otimes}(x) := \exp_0\!\big(\varphi(\log_0(x))\big).$$

Note that since addition and multiplication are already non-linear, you might not need any additional non-linearity. Give it a try. In our experiments, we obtained better results by removing it.
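The three steps above can be sketched in a few lines of NumPy (function names ours, for illustration):

```python
import numpy as np

def exp0(v):
    """Exponential map at the origin of the Poincare ball."""
    n = np.linalg.norm(v)
    return v if n == 0.0 else np.tanh(n) * v / n

def log0(y):
    """Logarithmic map at the origin (inverse of exp0)."""
    n = np.linalg.norm(y)
    return y if n == 0.0 else np.arctanh(n) * y / n

def mobius_fn(phi, x):
    """Mobius version of a pointwise non-linearity: phi(x) lifted via exp0(phi(log0(x)))."""
    return exp0(phi(log0(x)))
```

Whatever $\varphi$ does in the tangent space, the result always lands back inside the ball, because $\exp_0$ maps all of $\mathbb{R}^n$ into it.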

### Bias translation

How should we “translate” a point $x$ by another one $b$?

Since we already have a notion of addition, we can simply take $x \oplus b$.

However, if you’re already familiar with basic differential geometry, you could come up with another candidate: parallel-transport the tangent vector $\log_0(b)$ from $0$ to $x$, and map it back to the manifold with $\exp_x$.

As the two turn out to be equivalent, simply use $x \oplus b$ as your bias translation.

### RNN, GRU

It seems like we have all we need to define a simple RNN!

A naive RNN is usually defined with some pointwise non-linearity $\varphi$ by the following update equation:

$$h_{t+1} = \varphi(W h_t + U x_t + b),$$

where $W, U$ are matrices and $b$ is a bias.

A natural adaptation using our formulas yields:

$$h_{t+1} = \varphi^{\otimes}\!\big((W \otimes h_t) \oplus (U \otimes x_t) \oplus b\big).$$
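Putting the previous pieces together, one hyperbolic RNN update can be sketched in NumPy as follows (function names ours, for illustration):

```python
import numpy as np

def exp0(v):
    n = np.linalg.norm(v)
    return v if n == 0.0 else np.tanh(n) * v / n

def log0(y):
    n = np.linalg.norm(y)
    return y if n == 0.0 else np.arctanh(n) * y / n

def mobius_add(x, y):
    xy, xx, yy = np.dot(x, y), np.dot(x, x), np.dot(y, y)
    return ((1.0 + 2.0 * xy + yy) * x + (1.0 - xx) * y) / (1.0 + 2.0 * xy + xx * yy)

def mobius_matvec(M, x):
    return exp0(M @ log0(x))

def hyp_rnn_step(W, U, b, h, x, phi=np.tanh):
    """One hyperbolic RNN update: Mobius-combine the terms, then apply the Mobius non-linearity."""
    pre = mobius_add(mobius_add(mobius_matvec(W, h), mobius_matvec(U, x)), b)
    return exp0(phi(log0(pre)))
```

Here the hidden state $h$, the input embedding $x$, and the bias $b$ all live in the ball, so the update never leaves it.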

Now can we also adapt the GRU architecture?

Yes! Feel free to compare the formulas of the Euclidean and hyperbolic GRUs:

$$\begin{aligned} r_t &= \sigma(W^r h_{t-1} + U^r x_t + b^r), & \tilde{h}_t &= \varphi\big(W (r_t \odot h_{t-1}) + U x_t + b\big),\\ z_t &= \sigma(W^z h_{t-1} + U^z x_t + b^z), & h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t, \end{aligned}$$

versus

$$\begin{aligned} r_t &= \sigma\big(\log_0(W^r \otimes h_{t-1} \oplus U^r \otimes x_t \oplus b^r)\big), & \tilde{h}_t &= \varphi^{\otimes}\big((W \operatorname{diag}(r_t)) \otimes h_{t-1} \oplus U \otimes x_t \oplus b\big),\\ z_t &= \sigma\big(\log_0(W^z \otimes h_{t-1} \oplus U^z \otimes x_t \oplus b^z)\big), & h_t &= h_{t-1} \oplus \operatorname{diag}(z_t) \otimes \big((-h_{t-1}) \oplus \tilde{h}_t\big). \end{aligned}$$
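As a rough sketch of one hyperbolic GRU update, assuming the $c = 1$ Möbius operations from earlier sections (helper and parameter names are ours, for illustration):

```python
import numpy as np

def exp0(v):
    n = np.linalg.norm(v)
    return v if n == 0.0 else np.tanh(n) * v / n

def log0(y):
    n = np.linalg.norm(y)
    return y if n == 0.0 else np.arctanh(n) * y / n

def mobius_add(x, y):
    xy, xx, yy = np.dot(x, y), np.dot(x, x), np.dot(y, y)
    return ((1.0 + 2.0 * xy + yy) * x + (1.0 - xx) * y) / (1.0 + 2.0 * xy + xx * yy)

def mobius_matvec(M, x):
    return exp0(M @ log0(x))

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def hyp_gru_step(params, h, x, phi=np.tanh):
    """One hyperbolic GRU update (c = 1 sketch)."""
    Wr, Ur, br, Wz, Uz, bz, W, U, b = params
    # Gate activations are read off in the tangent space at 0, hence the log map.
    r = sigmoid(log0(mobius_add(mobius_add(mobius_matvec(Wr, h), mobius_matvec(Ur, x)), br)))
    z = sigmoid(log0(mobius_add(mobius_add(mobius_matvec(Wz, h), mobius_matvec(Uz, x)), bz)))
    # Candidate state: the reset gate rescales W before the Mobius matrix-vector product.
    pre = mobius_add(mobius_add(mobius_matvec(W @ np.diag(r), h), mobius_matvec(U, x)), b)
    h_tilde = exp0(phi(log0(pre)))
    # Update: move from h toward h_tilde; diag(z) acts as a Mobius matvec, i.e. a
    # coordinate-wise rescaling of log0(-h (+) h_tilde).
    return mobius_add(h, exp0(z * log0(mobius_add(-h, h_tilde))))
```

Note the gating by $z$ replaces the Euclidean convex combination $(1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$ with a gated Möbius translation from $h_{t-1}$ toward $\tilde{h}_t$.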

Remember that Euclidean space has zero curvature and the Poincaré ball has curvature $-1$? Well, in our paper we consider the general case of curvature $-c$ for some $c \geq 0$, and we show that by continuously moving $c$ from $1$ to $0$, our general h-GRU equations give back the standard GRU equations 🙂

In order to assess whether these models can actually be trained, we ran some simple experiments.

On these tasks, we found that hyperbolic RNNs either outperform or are on par with their Euclidean counterparts. However, more empirical investigation is needed.

### Hyperbolic softmax

In order to generalize multinomial logistic regression (MLR, also called softmax regression), we proceed in 3 steps:

- We first reformulate softmax probabilities with a distance to a margin hyperplane
- Second, we define hyperbolic hyperplanes
- Finally, by finding the closed-form formula of the distance between a point and a hyperbolic hyperplane, we derive the final formula.

Similarly to the hyperbolic GRU, we show that when the Poincaré ball is continuously flattened to Euclidean space (by sending its radius to infinity), hyperbolic softmax converges to Euclidean softmax.

The hyperbolic hyperplane centered at $p \in \mathbb{D}^n$, with normal direction $a \in T_p\mathbb{D}^n \setminus \{0\}$, is given by

$$\tilde{H}_{a,p} = \{x \in \mathbb{D}^n : \langle (-p) \oplus x,\, a\rangle = 0\}.$$

See Figure 2 for an illustration.

Have a look at our final closed-form formula for the distance of a point $x$ to the hyperbolic hyperplane $\tilde{H}_{a,p}$:

$$d\big(x, \tilde{H}_{a,p}\big) = \sinh^{-1}\!\left(\frac{2\,\big|\langle (-p) \oplus x,\, a\rangle\big|}{\big(1 - \|(-p) \oplus x\|^2\big)\,\|a\|}\right).$$
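This distance is a one-liner once Möbius addition is available; a NumPy sketch (function names ours, for illustration):

```python
import numpy as np

def mobius_add(x, y):
    xy, xx, yy = np.dot(x, y), np.dot(x, x), np.dot(y, y)
    return ((1.0 + 2.0 * xy + yy) * x + (1.0 - xx) * y) / (1.0 + 2.0 * xy + xx * yy)

def dist_to_hyperbolic_hyperplane(x, p, a):
    """Distance from x to the hyperbolic hyperplane through p with normal direction a."""
    z = mobius_add(-p, x)  # translate so the hyperplane passes through the origin
    return np.arcsinh(2.0 * abs(np.dot(z, a)) / ((1.0 - np.dot(z, z)) * np.linalg.norm(a)))
```

Points on the hyperplane (in particular $x = p$ itself) get distance $0$, and these per-class distances are what feed into the hyperbolic MLR scores.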

Figure 3 illustrates a comparison between 2-class classification obtained respectively with hyperbolic and Euclidean softmax.

In order to assess whether our hyperbolic adaptation of MLR would yield any practical improvements, we ran a few experiments.
