INDEX
Explanations
verbs or nouns related to actions, behavior, or events
Negative Logits
ixed      -0.74
Lank      -0.71
fer       -0.68
ļéĨĴ      -0.67
bid       -0.67
skinned   -0.67
ickets    -0.66
arton     -0.64
lined     -0.63
moon      -0.62
Positive Logits
uated     1.17
uate      1.15
ives      1.12
uary      1.05
uating    1.05
ional     1.02
ual       1.00
uations   0.99
ivity     0.97
ivism     0.97
Activations Density 2.097%