INDEX
Explanations
phrases expressing disbelief or skepticism
instances of disbelief or skepticism
Negative Logits
Token      Logit
itta       -0.68
catentry   -0.65
idth       -0.57
hift       -0.57
WAY        -0.56
oway       -0.56
arthed     -0.56
newcom     -0.55
stellar    -0.54
ãĤ¢        -0.54
Positive Logits
Token      Logit
chy        1.12
alian      1.12
happens    1.11
occurs     1.10
exists     1.08
hurts      1.08
's         1.05
belongs    1.03
unes       1.02
wasn       1.01
Activations Density 0.109%
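
The density figure above is not defined on this page. A common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is nonzero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the source.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: the feature fires on 109 of 100,000 sampled tokens.
sample = [1.0] * 109 + [0.0] * (100_000 - 109)
print(f"{activation_density(sample):.3%}")  # prints 0.109%
```

Under this reading, a density of 0.109% means the feature fires on roughly one token in every 900.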