INDEX
Explanations
phrases related to inclusivity or general statements about groups of people
references to groups of people or entities
Negative Logits
itta      -0.72
odcast    -0.70
edia      -0.67
tch       -0.64
instein   -0.62
iceps     -0.61
addafi    -0.61
osate     -0.61
senal     -0.60
ello      -0.60
Positive Logits
except        1.47
imaginable    1.07
alike         1.06
whatsoever    1.05
except        1.04
soever        0.99
irrespective  0.91
facets        0.88
regardless    0.79
thereafter    0.78
Activations Density: 0.207%