Explanations
names and references to individuals and their familial connections
Negative Logits
Token             Logit
ellig             -0.15
fav               -0.15
aben              -0.15
.scalablytyped    -0.15
259               -0.14
.ru               -0.14
synonym           -0.13
RIX               -0.13
rob               -0.13
uw                -0.13
Positive Logits
Token             Logit
into              0.35
Into              0.31
Into              0.30
into              0.29
INTO              0.28
towards           0.26
naar              0.26
_into             0.24
.into             0.23
vào               0.22
Activations Density 0.052%