Explanations
declarative statements leading to some sort of conclusion
statements that assert knowledge or understanding about a topic
Negative Logits
Token       Logit
concess     -0.61
+.          -0.60
hemor       -0.59
frequ       -0.59
sweats      -0.57
heels       -0.56
vain        -0.56
downed      -0.54
futile      -0.54
classics    -0.53
Positive Logits
Token        Logit
varies       0.89
differs      0.85
depends      0.84
is           0.83
illustrates  0.81
involves     0.77
boils        0.75
SPONSORED    0.73
fold         0.71
differed     0.71
Activations Density 0.169%
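
As a rough illustration of how a density figure like this could be read, the sketch below assumes that "activations density" means the fraction of sampled tokens on which the feature fires (activation above zero). That definition, the function name activation_density, the threshold parameter, and the toy data are all assumptions made for illustration; the page itself does not say how the number was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (tokens with activation > threshold) / (all tokens).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 tokens, 17 of which activate the feature.
toy_activations = [0.0] * 9983 + [1.5] * 17
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.169% would correspond to the feature firing on roughly 17 out of every 10,000 tokens.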