Explanations
words that express worry or anxiety about various issues
Negative Logits
Token     Logit
ynh       -0.69
ituri     -0.65
G         -0.64
(blank)   -0.64
Ony       -0.63
ة         -0.62
t         -0.62
way       -0.61
avy       -0.61
städ      -0.61
Positive Logits
Token     Logit
concerns  1.85
concern   1.81
Concern   1.79
Concerns  1.68
concerns  1.67
concern   1.66
CONCERN   1.59
Concern   1.53
CONCER    1.43
Concerns  1.42
Activations Density: 0.046%
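
To put the density figure in concrete terms, here is a minimal sketch that assumes density means the fraction of sampled tokens on which the feature has a nonzero activation. The function name, the threshold parameter, and the sample data are hypothetical and only illustrate the arithmetic; they are not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all. This is an illustrative definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 46 of which activate the feature.
sample = [0.0] * 99_954 + [1.0] * 46
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.046% corresponds to roughly 46 activating tokens per 100,000 sampled.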