INDEX
Explanations
articles and determiners preceding nouns
Negative Logits
notes          -0.81
coins          -0.76
encies         -0.76
Investigator   -0.74
ATURES         -0.73
Shape          -0.72
cair           -0.72
onge           -0.71
Acts           -0.71
examiner       -0.69
Positive Logits
plethora       1.09
bye            1.06
league         1.03
matchup        1.03
terrific       1.02
slew           1.00
whopping       1.00
venge          1.00
rematch        1.00
rookie         0.99
Activations Density: 0.188%