INDEX
Explanations
adjectives expressing strong emotional reactions
expressions of strong emotional reactions or judgments
Negative Logits
Token         Logit
ailability    -0.74
roma          -0.72
downed        -0.72
rongh         -0.71
arta          -0.68
haar          -0.68
ournal        -0.66
igate         -0.66
obo           -0.66
carrier       -0.64
Positive Logits
Token         Logit
Dragonbound   0.83
Carlson       0.77
enough        0.76
Osw           0.70
imaru         0.70
Pwr           0.70
Beir          0.69
irony         0.66
Leaks         0.65
coincidence   0.64
Activations Density: 0.182%