INDEX
Explanations
asterisks (*)
occurrences of the asterisk symbol
Negative Logits
unification   -0.68
seiz          -0.66
è¡            -0.65
manif         -0.64
shaping       -0.64
travers       -0.64
inator        -0.64
migr          -0.63
ivated        -0.63
etheless      -0.62
Positive Logits
AUT           0.88
Names         0.83
Required      0.81
NEW           0.80
TON           0.79
Correction    0.78
CM            0.77
NB            0.77
Thompson      0.76
NOTE          0.73
Activations Density 0.028%