Explanations
place names and descriptions
Negative Logits
ί    0.87
ado    0.83
deki    0.83
they    0.73
owy    0.71
ين    0.70
conseguiu    0.68
í    0.68
construye    0.68
ной    0.67
Positive Logits
I    0.69
pch    0.64
an    0.63
velocities    0.63
ribbons    0.62
the    0.61
ING    0.60
increases    0.60
s    0.59
Inst    0.59
Activations Density: 0.030%