Explanations
the word "for" in various contexts
Negative Logits
kara      -0.15
.rd       -0.15
enia      -0.15
zelf      -0.14
kop       -0.14
Bs        -0.14
огра      -0.14
optera    -0.14
mary      -0.14
aven      -0.14
Positive Logits
bidden    0.21
aging     0.18
rtl       0.17
mentor    0.17
aged      0.17
ump       0.16
geries    0.16
bear      0.16
yth       0.16
ney       0.15
Activations Density 0.095%