Explanations
prepositions indicating causes or reasons
Negative Logits
baum          -0.16
urtle         -0.16
ustum         -0.15
eum           -0.15
mps           -0.15
furt          -0.14
Wikispecies   -0.14
rời           -0.14
(#            -0.14
eldon         -0.14
Positive Logits
lie           0.17
fi            0.16
ò             0.16
sake          0.16
reason        0.15
way           0.15
iar           0.15
means         0.15
centaje       0.15
於            0.15
Activations Density 0.023%