INDEX
Explanations
intense emotional states, particularly those expressed with the word "deeply."
Negative Logits
orio     -0.15
æľŁ      -0.15
bara     -0.14
åĩ       -0.14
avra     -0.14
erce     -0.14
antry    -0.14
411      -0.14
dik      -0.13
imoto    -0.13
Positive Logits
thest    0.15
stein    0.15
Maduro   0.15
ohn      0.14
Bernie   0.14
atten    0.14
çľł      0.14
est      0.14
ening    0.14
ãĥ¼ãĥ³   0.14
Activations Density 0.003%
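
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is computed: the fraction of sampled tokens on which the feature's activation is above zero. This is an assumption about the metric, not a statement of how this dashboard derives it. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all; the dashboard's exact definition may differ.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 3 of 100,000 tokens.
acts = [0.0] * 100_000
for i in (10, 5_000, 99_999):
    acts[i] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")
```

Under that reading, a density of 0.003% means the feature is active on roughly 3 tokens in every 100,000 sampled.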