INDEX
Explanations
verbs related to change or transformation
Negative Logits
onto     -0.30
onto     -0.28
Ont      -0.21
Ont      -0.19
uci      -0.17
inski    -0.17
inks     -0.17
insky    -0.17
เข       -0.16
inki     -0.16
Positive Logits
menjadi   0.27
thành     0.25
become    0.23
becomes   0.22
Bec       0.20
becoming  0.20
into      0.20
Become    0.20
bec       0.19
became    0.19
Activations Density 0.050%
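
The negative and positive logit lists above rank tokens by how much this feature pushes their output logits down or up. The page itself does not say how the values are computed. A common approach, assumed here, is to take the dot product of the feature's decoder direction with each token's unembedding vector and keep the extremes. The sketch below illustrates that idea with a tiny hypothetical vocabulary; the function names, vectors, and numbers are made up for illustration and are not taken from this feature.

```python
# Minimal sketch, assuming logit effect = decoder direction . unembedding vector.
# All names and values below are hypothetical and chosen only for illustration.

def logit_effects(decoder_dir, unembed):
    """Return {token: dot(decoder_dir, unembedding vector of token)}."""
    return {
        tok: sum(d * w for d, w in zip(decoder_dir, vec))
        for tok, vec in unembed.items()
    }

def top_k(effects, k, positive=True):
    """Return the k tokens with the largest (or, if positive=False, smallest) effect."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=positive)
    return ranked[:k]

# Toy 2-dimensional residual space with a 4-token vocabulary (hypothetical values).
decoder_dir = [0.5, -0.2]
unembed = {
    "become": [0.4, -0.2],
    "into": [0.3, -0.1],
    "onto": [-0.5, 0.3],
    "the": [0.0, 0.0],
}

effects = logit_effects(decoder_dir, unembed)
print("positive:", top_k(effects, 2, positive=True))
print("negative:", top_k(effects, 2, positive=False))
```

Sorting once and slicing keeps the sketch short; with a real vocabulary of tens of thousands of tokens, a vectorized matrix product and a partial sort would be the more usual choice.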