Explanations
statements of personal opinion or sentiment
Negative Logits
ç·            -0.17
@student      -0.16
confess       -0.15
ONSE          -0.15
δά            -0.14
TestFixture   -0.14
-gun          -0.14
confessed     -0.14
bum           -0.14
quire         -0.14
Positive Logits
wonder     0.25
Wonder     0.20
wonders    0.18
expected   0.17
chal       0.17
Wonder     0.16
Agree      0.16
lived      0.16
fail       0.16
sick       0.15
Activations Density 0.181%