Explanations
phrases related to stories claimed to be true or having a sense of authenticity
references to "true" narratives or stories
Negative Logits
acco          -0.83
ocene         -0.76
Corp          -0.76
served        -0.74
Pages         -0.73
ONES          -0.73
uled          -0.73
artment       -0.71
adish         -0.71
azine         -0.68
Positive Logits
believer       1.00
believers      0.97
freshman       0.77
stic           0.77
positives      0.74
terday         0.71
ignment        0.69
patriot        0.69
çĭ             0.68
emancipation   0.67
Activations Density 0.022%