INDEX
Explanations
verbs and other action-oriented words emphasizing completion or significance
Negative Logits
Token        Logit
Cro          -0.18
eya          -0.18
Cro          -0.18
áze          -0.17
cro          -0.16
iÃŃ          -0.15
ergus        -0.15
buch         -0.15
ÙĪØµ         -0.15
_counters    -0.15
Positive Logits
Token        Logit
rel          0.15
-pic         0.14
ena          0.14
à¥įरथ        0.14
Peters       0.14
elli         0.14
-rel         0.14
Pic          0.14
loc          0.14
tle          0.14
Activations Density 0.003%