Explanations
words related to predatory behavior or potential dangers
references to predators and predatory behavior
Negative Logits
Token          Logit
provisional    -0.79
oard           -0.75
bel            -0.75
gran           -0.74
VK             -0.72
rique          -0.70
ahon           -0.69
UGE            -0.68
rief           -0.68
printed        -0.68
Positive Logits
Token          Logit
predators      1.29
prey           1.13
predator       1.07
Predators      0.85
ervative       0.84
stalking       0.84
predatory      0.80
instincts      0.79
ervatives      0.78
carniv         0.78
Activations Density: 0.018%
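
For readers unfamiliar with the density figure, the following is a minimal sketch of how such a number could be computed, assuming that "activation density" means the percentage of tokens on which the feature has a nonzero (positive) activation. The function name `activation_density` and the toy data are hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float]) -> float:
    """Return the percentage of tokens on which the feature is active.

    Assumption: a token counts as "active" when its activation is > 0.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > 0)
    return 100.0 * active / len(activations)


# Hypothetical toy data: 10,000 tokens, with the feature firing on 2 of them.
toy_activations = [0.0] * 9998 + [3.2, 0.7]
print(f"{activation_density(toy_activations):.3f}%")
```

Under this reading, a density of 0.018% would correspond to the feature firing on roughly 18 out of every 100,000 tokens.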