Explanations
verbs and terms indicating actions or processes
Negative Logits
Bellev        -0.16
Huck          -0.15
ern           -0.14
PTS           -0.14
ÑģÑĤÑĢо        -0.13
ActionTypes   -0.13
лек           -0.13
çĩķ           -0.13
Nam           -0.13
ожеÑĤ         -0.13
Positive Logits
oles          0.18
leys          0.16
éru           0.15
åĨĨ           0.15
gart          0.14
ogie          0.14
IDL           0.14
draulic       0.14
uhl           0.14
ellij         0.14
Activations Density 0.001%