Explanations
instances where someone is expressing an opinion or statement
Negative Logits
Token        Logit
=~=~         -0.77
ï¸ı          -0.69
ierrez       -0.69
ositories    -0.68
Redd         -0.67
utonium      -0.65
onut         -0.64
thia         -0.64
aint         -0.62
icut         -0.62
Positive Logits
Token        Logit
goodbye      1.25
bye          1.03
aloud        0.90
hello        0.84
farewell     0.75
\"           0.75
Goodbye      0.69
loudly       0.68
sorry        0.68
publicly     0.66
Activations Density 0.058%