INDEX
Explanations
words related to worry or concern
instances of concern or anxiety related to various subjects
Negative Logits
Token         Logit
ewitness      -0.85
inters        -0.76
ractions      -0.72
artifacts     -0.72
dating        -0.70
adr           -0.69
ingers        -0.67
avour         -0.66
guided        -0.65
authorized    -0.64
Positive Logits
Token         Logit
warts         0.93
worried       0.79
worry         0.79
about         0.78
wart          0.76
vier          0.75
bells         0.74
ABOUT         0.74
Pu            0.73
schizophren   0.73
Activations Density 0.031%
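
For readers unfamiliar with the density figure, the sketch below shows one common way such a number can be computed: the percentage of sampled tokens on which the feature's activation is above zero. This is an assumption about the metric, not something the page itself states; the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and chosen only so the arithmetic lands on 0.031%.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the share of sampled tokens on which
    the feature fires at all. The page does not define the metric.
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: the feature fires on 31 of 100,000 sampled tokens.
sample = [1.0] * 31 + [0.0] * 99_969
print(f"{activation_density(sample):.3f}%")
```

Under that reading, a density of 0.031% means the feature is active on roughly 3 of every 10,000 tokens, which fits a feature tied to a narrow topic such as worry or concern.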