INDEX
Explanations
references to specific groups of people or individuals
references to different groups of people, particularly women and individuals with medical conditions
Negative Logits
ãĥķãĤ©                            -0.90
rawdownloadcloneembedreportprint  -0.81
Assembly                          -0.80
Oracle                            -0.78
Unity                             -0.77
Capital                           -0.76
Politics                          -0.76
é»Ĵ                               -0.75
Platform                          -0.75
Shadow                            -0.74
Positive Logits
opausal                           1.10
diagnosed                         1.02
undergoing                        1.00
ingested                          0.98
aged                              0.98
hormone                           0.94
hospitalized                      0.94
ingest                            0.88
olesterol                         0.87
afflicted                         0.87
Activations Density 0.178%
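
Two of the tokens listed under Negative Logits (`ãĥķãĤ©` and `é»Ĵ`) look like byte-level BPE vocabulary strings rather than readable text. The sketch below shows one way to turn them back into characters. It assumes the tokens come from a GPT-2-style byte-level tokenizer, which this page does not state; the helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of any dashboard API.

```python
def bytes_to_unicode():
    """Rebuild the byte -> printable-character table used by GPT-2-style
    byte-level BPE (assumed here; the page does not name its tokenizer)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 or above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Invert the table: vocabulary character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map each vocabulary character back to its byte, then decode as UTF-8."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥķãĤ©", "é»Ĵ", "Assembly"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, `ãĥķãĤ©` decodes to `フォ` and `é»Ĵ` decodes to `黒`, while plain ASCII tokens such as `Assembly` come back unchanged.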