Explanations
strong emotions or opinions expressed in text
negative actions or statements related to inability or failure
Negative Logits
Balance           -0.75
recip             -0.68
Tokens            -0.67
ueless            -0.67
each              -0.66
Higher            -0.66
Compared          -0.65
balance           -0.64
Someone           -0.64
SpaceEngineers    -0.64
Positive Logits
ashtra            0.73
meanwhile         0.69
notations         0.68
Bris              0.65
Slate             0.64
Zub               0.63
Tsarnaev          0.62
Phant             0.62
Eliot             0.61
Sut               0.59
Activations Density 0.547%