INDEX
Explanations
phrases related to sports teams
mentions of teams and groups
Negative Logits
Theft         -0.64
aez           -0.62
imon          -0.62
distances     -0.62
silence       -0.60
dependence    -0.60
hazard        -0.59
uniquely      -0.58
Distribut     -0.58
threats       -0.58
Positive Logits
guiActiveUnfocused    0.78
headed                0.74
overseeing            0.69
Chel                  0.69
itol                  0.68
alion                 0.65
stationed             0.65
Olympia               0.64
Claude                0.64
tasked                0.64
Activations Density 0.331%