INDEX
Explanations
phrases related to the concept of correctness, specifically political correctness
references to political correctness
Negative Logits
Organ          -0.77
Rider          -0.70
joining        -0.67
Wings          -0.67
ogo            -0.66
paying         -0.65
Scher          -0.64
Revenue        -0.64
participating  -0.63
Bus            -0.62
Positive Logits
correctness    1.38
orthodoxy      0.99
guiActiveUn    0.95
behavi         0.91
prejudice      0.86
geist          0.85
ignorance      0.85
icity          0.84
iveness        0.81
iquette        0.80
Activations Density 0.014%