Explanations
mentions of a specific sports team, "Dolphins"
mentions of the Dolphins, indicating a strong focus on the football team
Negative Logits
dy            -0.83
sie           -0.73
ablishment    -0.73
lying         -0.72
Gree          -0.71
orate         -0.70
tick          -0.65
lest          -0.65
Bir           -0.65
ground        -0.64
Positive Logits
Dolphins      1.39
olphins       1.13
dolphins      0.89
Dolphin       0.87
Swim          0.81
Marlins       0.79
dolphin       0.77
turtles       0.75
Gardens       0.74
backer        0.73
Activations Density: 0.007%