INDEX
Explanations
references to the word "shark"
mentions of sharks
Negative Logits
Dayton        -0.78
orter         -0.74
cu            -0.72
Whe           -0.67
Eisenhower    -0.66
Ohio          -0.65
Davis         -0.64
COL           -0.64
Cel           -0.64
Advent        -0.64
Positive Logits
shark         3.74
sharks        3.52
Shark         3.29
Sharks        2.60
whale         1.76
whales        1.64
Whale         1.60
fins          1.59
squid         1.52
shrimp        1.45
Activations Density 0.019%