INDEX
Explanations
themes related to imminent danger and risk factors
Negative Logits
ekim          -0.14
beros         -0.13
532           -0.13
khúc          -0.13
_firestore    -0.13
Leaks         -0.13
ç¸            -0.13
iny           -0.13
559           -0.13
ickets        -0.13
Positive Logits
danger        1.00
dangers       0.83
Danger        0.80
threat        0.78
risk          0.75
danger        0.75
dangerous     0.73
-danger       0.72
hazard        0.70
危            0.68
Activations Density 0.560%