INDEX
Explanations
words related to actions or behaviors
words related to act and action-oriented concepts
Negative Logits
Token      Logit
tradem     -0.83
yip        -0.82
Ń·         -0.81
whisk      -0.78
ĨĴ         -0.75
tremend    -0.74
psey       -0.72
uyomi      -0.71
awaru      -0.69
flared     -0.69
Positive Logits
Token      Logit
act        1.16
uated      1.02
uary       0.96
ional      0.93
itol       0.86
ivism      0.86
Pub        0.85
ivity      0.83
uate       0.82
asia       0.79
Activations Density: 0.009%