INDEX
Explanations
verbs indicating causation or driving force
phrases that indicate causation
NEGATIVE LOGITS
raq          -0.80
naissance    -0.77
kered        -0.75
asy          -0.75
SPONSORED    -0.73
resil        -0.72
icket        -0.72
itled        -0.71
hement       -0.71
apest        -0.70
POSITIVE LOGITS
virtue       1.32
products     1.05
laws         0.92
fiat         0.89
multiplying  0.79
afar         0.77
means        0.75
leaps        0.74
default      0.71
STATS
0.70
Activations Density 0.272%