INDEX
Explanations
words related to worries or anxieties
phrases expressing concern or worry
Negative Logits
avorite    -0.78
iller      -0.75
oba        -0.71
avour      -0.66
OVA        -0.66
ingers     -0.62
alter      -0.59
Bom        -0.59
egal       -0.59
aro        -0.59
Positive Logits
ingly      0.99
about      0.96
lessly     0.91
lest       0.84
trolling   0.84
ABOUT      0.80
wart       0.78
edly       0.78
bells      0.77
ativity    0.75
Activations Density 0.046%