INDEX
Explanations
verbs related to change or transition
Negative Logits
Token     Logit
oref      -0.16
18        -0.15
ools      -0.15
kek       -0.14
ừ         -0.14
enheim    -0.14
clide     -0.14
œur       -0.14
isted     -0.14
jsx       -0.14
Positive Logits
Token     Logit
ía        0.37
án        0.35
á         0.35
ían       0.31
emos      0.25
iam       0.25
í         0.24
ás        0.24
ías       0.23
ia        0.22
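
Dashboards of this kind sometimes show tokens in the tokenizer's byte-level alphabet rather than as decoded text; for instance, the raw string `ÃŃa` corresponds to `ía`. The sketch below shows one way to decode such strings. It assumes a GPT-2-style byte-level BPE vocabulary, which the source does not confirm, and `decode_token` is a hypothetical helper name. Tokens that are already decoded (such as `án`) are returned unchanged, which is a heuristic rather than a guarantee.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Byte -> printable character table used by GPT-2-style byte-level BPE."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    codepoints = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are remapped to code points 256, 257, ...
            printable.append(b)
            codepoints.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(printable, codepoints)}


UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(raw: str) -> str:
    """Decode a byte-level token string; return it unchanged if it does not decode."""
    try:
        data = bytes(UNICODE_TO_BYTE[ch] for ch in raw)
        return data.decode("utf-8")
    except (KeyError, UnicodeDecodeError):
        # Assumption: the string was already plain text, so leave it alone.
        return raw


print(decode_token("\u00c3\u0143a"))   # ía
print(decode_token("\u00c5\u0135ur"))  # œur
print(decode_token("jsx"))             # jsx
```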
Activations Density 0.009%
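
The source does not define how the density figure is computed. One common reading, taken here as an assumption, is the percentage of sampled tokens on which the feature's activation exceeds zero. Under that reading, 0.009% corresponds to roughly 9 active tokens per 100,000. The function name and threshold parameter below are hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Percentage of tokens whose activation is strictly above `threshold`.

    Assumption: density means "fraction of sampled tokens where the feature
    fires", expressed as a percentage.
    """
    total = len(activations)
    if total == 0:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / total


# Toy data: 9 active tokens out of 100,000 sampled tokens.
sample = [1.0] * 9 + [0.0] * 99_991
print(f"{activation_density(sample):.3f}%")  # 0.009%
```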