Explanations
religious terms, political entities, and social issues
references to groups or identities related to social issues and injustices
Negative Logits
Token       Logit
Redditor    -0.74
çͰ          -0.70
Usage       -0.62
awa         -0.62
Logged      -0.61
ISION       -0.61
Symptoms    -0.61
davidjl     -0.61
Compat      -0.60
Niet        -0.60
Positive Logits
Token        Logit
town         0.77
conservancy  0.65
wealth       0.63
arton        0.61
hops         0.61
gov          0.61
clubs        0.60
chool        0.59
alf          0.59
secondary    0.58
Activations Density: 0.771%