INDEX
Explanations
references to "Hasbro" in the text
references to the brand "Hasbro."
Negative Logits
Gins        -0.68
orious      -0.66
Prim        -0.64
Noon        -0.61
Dull        -0.60
────────    -0.60
Nir         -0.59
famed       -0.58
aries       -0.58
outfield    -0.58
Positive Logits
ccoli       1.20
bro         1.11
idered      1.00
keye        0.94
afort       0.92
vity        0.91
kes         0.91
cffff       0.90
kef         0.90
oke         0.90
Activations Density 0.010%