Explanations
words related to being worried, troubled, or having a strong interest in something
expressions of concern or worry
Negative Logits
arb       -0.78
iller     -0.72
aro       -0.70
orge      -0.70
avour     -0.67
hement    -0.65
ples      -0.64
exc       -0.61
urus      -0.61
avorite   -0.61
Positive Logits
about      1.34
ABOUT      1.15
about      0.99
lest       0.97
About      0.96
lessly     0.96
ingly      0.95
aloud      0.86
regarding  0.82
About      0.81
Activations Density 0.062%
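
The density figure above is not defined on the page. A common reading, assumed here, is the fraction of token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires; the dashboard itself does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 6 of which activate the feature.
acts = [0.0] * 10000
for i in (5, 120, 999, 4000, 7500, 9999):
    acts[i] = 1.5

density = activation_density(acts)
print(f"density = {density:.3%}")
```

Under this reading, a density of 0.062% would mean the feature fires on roughly 6 of every 10,000 token positions.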