INDEX
Explanations
phrases indicating learning preferences and educational contexts
Negative Logits
esser            -0.19
braco            -0.18
.GraphicsUnit    -0.18
uros             -0.17
imary            -0.17
LEAR             -0.17
viar             -0.15
antro            -0.15
gressor          -0.15
peq              -0.15
Positive Logits
stat             0.19
↵                0.16
Stat             0.16
etc              0.15
whereas          0.15
/stat            0.15
bert             0.14
Carey            0.14
933              0.14
jay              0.14
Activations Density: 0.079%