INDEX
Explanations
phrases indicating proximity or distance
NEGATIVE LOGITS
ATIC        -0.15
Highlands   -0.15
å©Ĩ         -0.14
wipe        -0.14
etr         -0.14
hats        -0.13
anton       -0.13
estic       -0.13
Hogan       -0.13
γÏī         -0.13
POSITIVE LOGITS
hea         0.16
away        0.16
yll         0.15
673         0.15
oard        0.15
672         0.15
ourt        0.15
beck        0.14
Yani        0.14
ÏĤ          0.14
Activations Density 0.114%