INDEX
Explanations
words related to actions or processes that involve a level of deliberation or decision-making
words related to various forms of "actions" or "activities."
Negative Logits
ALL       -0.63
Present   -0.62
URRENT    -0.61
ASED      -0.60
STAT      -0.60
âĹı       -0.60
ãĥ£       -0.60
avez      -0.59
CHAR      -0.58
AGES      -0.58
Positive Logits
ations    1.24
hower     1.17
etting    0.97
etter     0.95
leeve     0.90
eus       0.89
arily     0.86
ulations  0.85
chwitz    0.85
hip       0.84
Activations Density: 0.024%