INDEX
Explanations
phrases highlighting conditions or stipulations
Negative Logits
kees     -0.18
Loc      -0.16
ocop     -0.16
olars    -0.16
ASTE     -0.16
lew      -0.15
ahr      -0.14
AGMA     -0.14
olian    -0.14
Aster    -0.14
Positive Logits
parator  0.16
(æĹ¥     0.15
ziej     0.15
rad      0.15
upert    0.15
ewood    0.14
imon     0.14
emap     0.14
aeper    0.14
uner     0.14
Activations Density: 0.017%