INDEX
Explanations
words related to ideologies and beliefs
references to ideologies and related concepts
Negative Logits
FACE         -0.91
SEAL         -0.76
lain         -0.70
ARDS         -0.70
smith        -0.70
bucks        -0.66
Sands        -0.66
ILY          -0.66
disinfect    -0.65
Mercy        -0.65
Positive Logits
ologue       1.02
ī            0.98
ide          0.90
rix          0.90
mble         0.90
theoret      0.89
ality        0.87
ogly         0.85
iple         0.83
atorial      0.83
Activations Density 0.014%
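
The page does not say how the density figure is computed. One common reading is the fraction of sampled tokens on which the feature fires at all. The sketch below illustrates that reading only: `activation_density` is a hypothetical helper, the zero threshold is an assumption, and the sample values are made up for illustration rather than taken from this feature.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumed definition; the source page does not state its formula.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    # Count tokens on which the feature is considered active.
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Illustrative data only: 10,000 tokens, two of which activate the feature.
sample = [0.0] * 9998 + [1.3, 0.4]
density = activation_density(sample)
print(f"Activation density: {density:.3%}")
```

Under this reading, a density of 0.014% would mean the feature is active on roughly 14 of every 100,000 tokens.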