Explanations
phrases related to emotions and feelings
Negative Logits
Token   Logit
gio     -0.15
aru     -0.15
ddl     -0.15
arda    -0.15
nostic  -0.15
AtPath  -0.14
ite     -0.14
str     -0.14
Dram    -0.14
nosis   -0.14
Positive Logits
Token        Logit
feelings     0.28
offended     0.27
offence      0.21
offend       0.21
hurt         0.20
sensitive    0.20
sensit       0.20
offense      0.20
Hurt         0.20
sensitivity  0.19
Activations Density 0.047%