INDEX
Explanations
verbs related to conditions or states of being that indicate influence or dependency
Negative Logits
licit     -0.15
cher      -0.15
/         -0.15
fraught   -0.14
toler     -0.14
neau      -0.14
ead       -0.14
for       -0.14
alone     -0.14
earch     -0.14
Positive Logits
influenced   0.35
dependent    0.34
dependent    0.30
depend       0.29
affected     0.29
affected     0.29
determined   0.28
driven       0.27
depends      0.24
depends      0.24
Activations Density 0.171%