INDEX
Explanations
results of actions or events and their consequences
sentence structures indicating causal relationships and outcomes
Negative Logits
Truth       -0.86
obin        -0.85
venth       -0.81
TPP         -0.77
izont       -0.77
erest       -0.77
agate       -0.76
wcsstore    -0.75
Really      -0.75
ifest       -0.74
Positive Logits
researchers      1.16
biologists       1.09
villagers        1.09
investigators    1.09
doctors          1.09
residents        1.08
inspectors       1.06
scientists       1.06
locals           1.06
surgeons         1.05
Activations Density: 0.387%