INDEX
Explanations
negations and phrases indicating lack of necessity or worry
NEGATIVE LOGITS
assis        -0.16
oda          -0.16
gorithm      -0.15
vant         -0.14
ippet        -0.14
somehow      -0.14
onest        -0.14
accordingly  -0.14
serious      -0.14
opi          -0.13
POSITIVE LOGITS
worry        0.46
fret         0.35
worrying     0.35
concern      0.34
bother       0.34
worries      0.30
worried      0.28
waste        0.28
stress       0.28
preocup      0.26
Activations Density: 0.154%