Explanations
references to boys and girls in the text
NEGATIVE LOGITS
BeginInit    -0.76
Genn         -0.75
>")          -0.74
"").         -0.74
\}}          -0.74
″]           -0.73
")}          -0.71
]").         -0.70
AndEndTag    -0.69
"=>"         -0.69
POSITIVE LOGITS
Boys         2.12
boys         2.12
boy          2.09
BOYS         2.06
BOY          2.00
Boy          2.00
Boys         1.98
Boy          1.96
boy          1.95
boys         1.92
Activations Density: 0.034%