INDEX
Explanations
connections represented by conjunctions in phrases
Negative Logits
.getvalue     -0.15
urg           -0.15
isu           -0.14
893           -0.14
aggress       -0.14
rez           -0.14
opleft        -0.13
injunction    -0.13
ZD            -0.13
ilo           -0.13
Positive Logits
/or           0.22
para          0.18
semi          0.17
rog           0.17
ific          0.17
Passwords     0.16
otherwise     0.16
rogen         0.15
para          0.15
BITS          0.14
Activations Density 0.175%