Explanations
words related to dogs, specifically pit bulls
references to pit bulls
Negative Logits
Carbuncle      -0.71
Lauder         -0.69
サーティワン   -0.68
Polo           -0.67
IGH            -0.62
Leilan         -0.61
hig            -0.59
edly           -0.59
e              -0.58
XD             -0.58
Positive Logits
cair           1.41
iful           1.35
ifully         1.27
iless          1.26
cher           1.17
bull           1.03
falls          0.93
uit            0.93
ney            0.91
adium          0.91
Activations Density 0.037%
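
The density figure can be read as the share of tokens on which the feature fires at all. That reading is an assumption here, since the page does not define the term. Under it, a minimal sketch of the calculation might look like the following; the function name, the zero threshold, and the sample data are all hypothetical, and the sample is constructed only to reproduce the 0.037% shown above.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    activation. The source page does not state its exact definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 100,000 tokens, 37 of which activate the feature.
sample = [0.0] * 99_963 + [1.0] * 37
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.037%"
```

A density this low would mean the feature fires on roughly one token in 2,700, which fits a narrowly scoped explanation such as references to pit bulls.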