Explanations
questions starting with "Do"
questions posed to the reader
Negative Logits
Token          Logit
Reviewer       -0.83
)=(            -0.70
boarding       -0.70
workshop       -0.69
creen          -0.67
Software       -0.65
cream          -0.64
Azerb          -0.63
Handling       -0.63
EStreamFrame   -0.62
Positive Logits
Token          Logit
omsday         1.34
ppel           1.18
herty          1.12
zens           1.06
lez            1.05
lyak           0.98
oley           0.95
ozy            0.91
ggies          0.90
oway           0.90
Activations Density 0.081%