Explanations
repeated instances of the word "as" indicating comparisons or analogies
Negative Logits
Parms      -0.16
vous       -0.15
èį·        -0.15
riers      -0.14
mant       -0.14
anner      -0.14
ovit       -0.14
ç´ł        -0.14
kaz        -0.13
rape       -0.13
Positive Logits
fat        0.16
ocz        0.14
yw         0.14
arr        0.14
resh       0.13
Yar        0.13
bsolute    0.13
ersions    0.13
æ¡         0.13
err        0.13
Activations Density: 0.082%