INDEX
Explanations
interrogative and comparative phrases
Negative Logits
iaux        -0.08
lify        -0.08
itionally   -0.08
%[          -0.08
isinden     -0.07
ricks       -0.07
icamente    -0.07
.esp        -0.07
zsche       -0.07
apus        -0.07
Positive Logits
/or         0.07
ness        0.07
ion         0.07
-to         0.07
sembl       0.07
-than       0.07
ocratic     0.07
rog         0.07
s           0.06
ment        0.06
Activations Density 0.074%