INDEX
Explanations
words related to skepticism or critical viewpoints
Negative Logits
ouse     -0.17
Fol      -0.17
698      -0.16
yun      -0.15
tn       -0.15
UGE      -0.15
rika     -0.14
ail      -0.14
adx      -0.14
mul      -0.14
Positive Logits
ptic     0.29
letal    0.25
chers    0.20
Ske      0.20
icism    0.19
letic    0.18
pch      0.16
skept    0.16
-UA      0.16
scept    0.16
Activations Density 0.007%