INDEX
Explanations
locations and place names
the verb "be" in various forms and contexts
Negative Logits
azines    -0.77
monop     -0.72
confir    -0.69
Leilan    -0.69
rador     -0.66
squeeze   -0.65
glide     -0.65
drift     -0.64
Ples      -0.63
raint     -0.62
Positive Logits
yond      1.39
arers     1.14
arer      1.12
cker      1.05
zos       1.02
FORE      0.99
ards      0.95
ech       0.93
xt        0.93
gotten    0.92
Activations Density: 0.028%