Explanations
mentions of names of individuals, likely in a professional context
proper nouns and names of individuals or organizations
Negative Logits
margins       -0.68
receptors     -0.68
%%            -0.61
stereotypes   -0.61
calendar      -0.61
marginal      -0.59
process       -0.59
Characters    -0.59
bucks         -0.59
ãĥ¥           -0.59
Positive Logits
Jr            1.03
Sr            0.98
QC            0.86
meanwhile     0.86
aka           0.84
PhD           0.79
MD            0.78
chairman      0.76
Managing      0.76
however       0.76
Activations Density: 0.177%