INDEX
Explanations
the word "normal" or related terms
Negative Logits
artisan    -0.77
hani       -0.70
better     -0.67
wark       -0.67
iosyncr    -0.63
raped      -0.63
Goff       -0.63
REL        -0.63
intel      -0.63
Sov        -0.62
Positive Logits
cy         1.50
ization    1.38
izing      1.37
izes       1.33
ised       1.28
isation    1.28
ize        1.24
ized       1.20
izer       1.14
izers      1.13
Activations Density 0.034%