Explanations
instances of the phrase "too often" in the text
Negative Logits
bol     -0.72
Sever   -0.64
uddin   -0.64
orah    -0.62
TNT     -0.61
ebus    -0.60
Panzer  -0.60
stroke  -0.59
elta    -0.59
orthy   -0.59
Positive Logits
misconceptions  1.05
outdated        1.00
stereotypes     0.97
societal        0.96
stigma          0.91
industrialized  0.90
stereotype      0.88
ignorance       0.88
society         0.88
misinformation  0.87
Activations Density 0.817%