INDEX
Explanations
mentions of the breed "pit bull"
mentions of "pit bulls."
Negative Logits
Token               Logit
Carbuncle           -0.68
Lauder              -0.66
Feinstein           -0.61
e                   -0.59
Polo                -0.59
ë                   -0.58
ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³  -0.58
Pacific             -0.57
XD                  -0.57
commons             -0.57
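The entry `ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³` in the negative-logit list looks like encoding debris, but it may simply be the raw vocabulary string of a byte-level BPE token. The sketch below assumes (the page does not say so) that the underlying model uses the GPT-2 style byte-to-character table, and shows how such a string could be mapped back to readable text. The helper name `decode_token` is hypothetical; `bytes_to_unicode` mirrors the table-building routine published with GPT-2.

```python
def bytes_to_unicode():
    """Rebuild the byte -> printable-character table used by GPT-2 style byte-level BPE."""
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to an unused code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


def decode_token(token):
    """Hypothetical helper: turn a byte-level vocabulary string back into UTF-8 text."""
    char_to_byte = {c: b for b, c in bytes_to_unicode().items()}
    raw_bytes = bytes(char_to_byte[c] for c in token)
    # Assumption: the token covers whole UTF-8 characters; partial ones show as U+FFFD.
    return raw_bytes.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³"))
```

Under that assumption the string decodes to the katakana word サーティワン, while plain ASCII tokens such as `bull` pass through unchanged. If the model uses a different tokenizer, this mapping does not apply.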
Positive Logits
Token               Logit
iful                1.51
cair                1.41
iless               1.40
ifully              1.37
bull                1.26
cher                1.23
chers               1.00
uit                 0.99
iable               0.98
bulls               0.97
Activations Density 0.036%