INDEX
Explanations
references to snakes or snake-related terms
mentions of snakes
Negative Logits
Token      Logit
estine     -0.87
vre        -0.80
encer      -0.76
amily      -0.73
ICAN       -0.70
ufact      -0.67
sts        -0.66
eatures    -0.65
VB         -0.63
mable      -0.63
Positive Logits
Token      Logit
bite       1.24
snakes     1.11
snake      0.98
venom      0.91
Snake      0.90
reptiles   0.90
turtles    0.89
guards     0.88
zilla      0.84
oche       0.80
Activations Density: 0.015%