INDEX
Explanations
expressions of agreement or affirmation
Negative Logits
Token           Logit
bound           -0.71
animate         -0.71
Bound           -0.65
tis             -0.63
Ascension       -0.63
adden           -0.61
!/              -0.60
approximation   -0.59
Binding         -0.59
endi            -0.59
Positive Logits
Token           Logit
cautioned       1.08
noted           1.06
criticized      1.05
thanked         1.03
criticised      1.01
wondered        1.01
highlighted     1.01
addressed       1.00
emphasized      0.99
praised         0.99
Activations Density: 0.063%