Explanations
verbs indicating attempts or actions toward a goal
NEGATIVE LOGITS
itize      -0.19
iated      -0.18
italize    -0.17
IZED       -0.16
ISED       -0.16
ILER       -0.16
hausen     -0.16
pone       -0.16
urator     -0.16
ekler      -0.15
POSITIVE LOGITS
ings       0.71
ing        0.63
ng         0.62
Ing        0.61
ÂŃing      0.58
INGS       0.51
-ing       0.49
ning       0.47
Ing        0.45
ining      0.43
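The token shown as ÂŃing in the list above looks garbled, but it matches how GPT-2-style byte-level BPE vocabularies display raw bytes. The sketch below assumes the dashboard's model uses that tokenizer family, which the page itself does not state. It rebuilds the byte-to-character table in plain Python and maps the display string back to text. The helper name decode_display_token is hypothetical.

```python
def bytes_to_unicode():
    """Byte -> printable character table used by GPT-2-style byte-level BPE."""
    # Bytes that are already printable keep their own code point.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    chars = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are shifted to code points 256 and above.
            printable.append(b)
            chars.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(printable, chars)}


def decode_display_token(token: str) -> str:
    """Hypothetical helper: turn a vocabulary display string back into text."""
    char_to_byte = {c: b for b, c in bytes_to_unicode().items()}
    raw = bytes(char_to_byte[c] for c in token)
    return raw.decode("utf-8", errors="replace")


decoded = decode_display_token("ÂŃing")
print(repr(decoded))  # '\xading': a soft hyphen (U+00AD) followed by "ing"
```

Under this assumption the token is simply "ing" preceded by a soft hyphen, which fits the other "-ing" variants among the positive logits.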
Activations Density 0.051%
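As a rough illustration of the density figure, the sketch below assumes it is the fraction of sampled tokens on which the feature has a nonzero activation. That definition is an assumption, the function name activation_density is hypothetical, and the sample data are invented so that the arithmetic lands on the same percentage.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Invented sample: 51 active tokens out of 100,000.
sample = [1.0] * 51 + [0.0] * 99_949
density = activation_density(sample)
print(f"{density:.3%}")  # 0.051%
```

At this density the feature fires on roughly one token in two thousand.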