Explanations
TV show titles containing the word "Bull"
Negative Logits
ALLY         -0.74
ALS          -0.72
MENTS        -0.66
mble         -0.66
Genie        -0.65
MENT         -0.64
Archdemon    -0.63
SPONSORED    -0.62
bourg        -0.62
Norn         -0.61
Positive Logits
dog          1.17
shit         1.10
ocks         1.07
ock          1.05
fighter      1.02
iard         1.01
fights       0.98
frog         0.98
oon          0.98
oons         0.95
Activations Density: 0.019%