Explanations
runners-up or second-place finishes in competitions
NEGATIVE LOGITS
olia        -0.79
rification  -0.74
ILCS        -0.70
Debor       -0.69
holdings    -0.68
ornia       -0.67
rox         -0.66
inian       -0.66
orescent    -0.65
illery      -0.64
POSITIVE LOGITS
ners        1.02
swick       0.99
gaard       0.94
wear        0.85
ways        0.84
nings       0.84
runner      0.81
bum         0.79
aways       0.77
stitch      0.76
Activations Density 0.012%