INDEX
Explanations
themes related to danger, conflict, and existential threats
Things that are scary or threatening
hostile entities attacking
Negative Logits
Sprintf         -0.53
پذیر            -0.50
AnchorStyles    -0.49
شهاد            -0.47
ệm              -0.46
ριν             -0.44
vermitteln      -0.43
verg            -0.43
rifty           -0.43
einger          -0.42
Positive Logits
targeting       1.07
lurking         1.05
threatening     1.04
attacking       1.04
stalking        0.99
prow            0.94
wre             0.92
Targeting       0.92
menacing        0.91
threaten        0.89
Activations Density: 0.424%