INDEX
Explanations
declarative and descriptive statements about experiences or conditions
Negative Logits
acob       -0.17
Wag        -0.16
idine      -0.16
isclosed   -0.15
ysz        -0.15
cia        -0.15
iah        -0.15
AGIC       -0.14
allas      -0.14
Watkins    -0.14
Positive Logits
oyo        0.15
ownership  0.15
oom        0.14
Harness    0.14
orida      0.14
otor       0.14
Minute     0.14
Ownership  0.13
ownership  0.13
ãĥ³ãĥĹ     0.13
Activations Density: 0.063%