Explanations
references to spatial contexts or locations within a text
Negative Logits
ses       -0.19
anna      -0.15
chip      -0.15
éri       -0.14
iller     -0.14
yat       -0.14
alties    -0.14
aren      -0.14
živ       -0.14
ald       -0.14
Positive Logits
most      0.16
regard    0.16
lac       0.15
786       0.15
/out      0.15
ë¶Ģ       0.14
creasing  0.14
Ùħار      0.14
spirit    0.14
Ñģобой    0.13
Activations Density 0.050%