INDEX
Explanations
references to scales and measurements
Negative Logits
zelf      -0.22
est       -0.17
assis     -0.17
ernals    -0.16
sell      -0.16
urer      -0.16
iates     -0.15
unker     -0.15
sWith     -0.15
ries      -0.15
Positive Logits
-down     0.22
ToFit     0.20
-up       0.20
azy       0.17
out       0.17
ardy      0.17
-out      0.16
able      0.16
andin     0.15
ंड        0.15
Activations Density 0.016%