INDEX
Explanations
phrases indicating proximity or distance
Negative Logits
isci      -0.19
inci      -0.16
onn       -0.15
embed     -0.15
361       -0.15
inea      -0.15
ÙĪØªÛĮ    -0.15
ongan     -0.14
iera      -0.14
обов      -0.14
Positive Logits
ussy      0.17
erland    0.16
acker     0.16
Ih        0.15
.nd       0.14
nds       0.14
Morales   0.14
inç       0.14
Ain       0.14
CES       0.13
Activations Density: 0.100%