Explanations
verbs related to action and change
Negative Logits
将    -0.18
将    -0.16
uen   -0.16
odel  -0.15
387   -0.15
將    -0.15
ire   -0.15
antz  -0.15
ru    -0.15
ルト  -0.14
Positive Logits
á     0.34
án    0.31
ía    0.30
à     0.29
ían   0.22
anno  0.21
emos  0.21
ás    0.21
ait   0.20
ia    0.19
Activations Density 0.007%