Explanations
verbs related to observing or monitoring something
inquiries about future outcomes or scenarios involving comparisons and evaluations
Negative Logits
ndra         -0.68
pioneered    -0.66
innocence    -0.64
invented     -0.64
otin         -0.60
rir          -0.60
udicrous     -0.57
ools         -0.57
pity         -0.55
saf          -0.54
Positive Logits
versus        1.14
compared      1.13
depends       1.07
relative      1.01
varies        0.96
differs       0.94
depending     0.93
vis           0.93
differently   0.88
differ        0.85
Activations Density 0.376%