Explanations
mentions of dangerous or harmful situations
instances of the word "deadly" in relation to various harmful events or conditions
Negative Logits
erous     -0.87
agate     -0.87
ourced    -0.84
ership    -0.80
sama      -0.78
arity     -0.77
adra      -0.74
Catalog   -0.74
estamp    -0.74
via       -0.74
Positive Logits
poisonous  0.97
deadly     0.95
wounding   0.87
lethal     0.86
poison     0.82
assault    0.82
violence   0.78
threats    0.78
ãĥ£        0.78
fatal      0.77
Activations Density 0.011%
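
The token "ãĥ£" in the positive logits list looks like a byte-level BPE vocabulary string rather than readable text. The sketch below shows one way to turn such a string back into characters. It assumes the tokens come from a GPT-2-style byte-level vocabulary, which this listing does not state, and the helper names bytes_to_unicode and decode_token are illustrative only.

```python
def bytes_to_unicode():
    """Build the GPT-2-style table mapping each byte value to a printable character."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


def decode_token(token):
    """Map a vocabulary string back to raw bytes, then decode those bytes as UTF-8."""
    unicode_to_byte = {ch: b for b, ch in bytes_to_unicode().items()}
    raw = bytes(unicode_to_byte[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


# Assumption: the dashboard shows raw GPT-2-style vocabulary strings.
print(decode_token("ãĥ£"))  # ャ (katakana small ya, U+30E3)
```

If the assumption holds, that entry is a Japanese katakana character and not encoding damage, so it is left unchanged in the list.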