INDEX
Explanations
numerical values and fractions
percentages and numerical data related to demographics or statistical information
Negative Logits
advertisement   -0.83
Claim           -0.81
Apps            -0.79
Projects        -0.76
Sites           -0.76
Products        -0.76
ALWAYS          -0.75
Sym             -0.75
ï¸              -0.72
Cause           -0.72
Positive Logits
white           0.91
suburban        0.90
African         0.89
black           0.85
wom             0.84
blue            0.83
female          0.83
juvenile        0.83
Hispanic        0.82
gay             0.82
Activations Density 0.463%