INDEX
Explanations
phrases related to causality or reasons
phrases indicating causation or relationships between concepts
Negative Logits
ascus     -0.86
scribe    -0.78
ail       -0.77
mouth     -0.76
bage      -0.75
arez      -0.75
adesh     -0.72
umat      -0.72
rete      -0.72
soever    -0.72
Positive Logits
fact              1.02
inexper           0.95
methodological    0.94
differences       0.90
sheer             0.90
lack              0.86
ignorance         0.85
misunderstanding  0.85
familiarity       0.84
inertia           0.84
Activations Density: 0.236%