INDEX
Explanations
the word "vert" or variations of it such as "vertex" or "avert" with varying degrees of activation
words related to subversion or undermining authority
Negative Logits
Token          Logit
llah           -0.82
çĦ             -0.76
utical         -0.75
Interstitial   -0.70
senal          -0.69
ptive          -0.68
Bei            -0.65
ptives         -0.65
Aw             -0.62
andro          -0.61
Positive Logits
Token          Logit
ibility        1.01
verted         0.99
version        0.91
ophobia        0.83
ibly           0.81
verts          0.80
ible           0.79
ibilities      0.76
verting        0.76
ected          0.74
Activations Density: 0.031%
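
A density figure like the one above is usually read as the fraction of token positions on which the feature fires. The sketch below shows one way such a number could be computed. It assumes that "active" means an activation strictly greater than zero, which this page does not state. The function name activation_density and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a position counts as active when its activation is strictly
    greater than `threshold` (0.0 by default). The source page does not
    specify the exact definition it uses.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 3 of which activate the feature.
sample = [0.0] * 9997 + [0.4, 1.2, 2.5]
density = activation_density(sample)

# Format as a percentage, the way the dashboard reports it.
print(f"Activations Density: {density:.3%}")
```

Under this reading, a density of 0.031% would mean the feature fires on roughly 3 of every 10,000 tokens, which fits a sparse feature tied to a narrow family of word fragments.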