INDEX
Explanations
mentions of academic or professional titles, specifically "associate professor."
Negative Logits
token               logit
imon                -0.80
cale                -0.77
ODUCT               -0.74
plays               -0.72
CRIP                -0.72
aneers              -0.72
acing               -0.72
akens               -0.72
=-=-=-=-=-=-=-=-    -0.71
itans               -0.71
Positive Logits
token               logit
dean                1.00
professor           0.87
newsp               0.83
lies                0.80
lly                 0.79
rete                0.72
ially               0.69
rium                0.68
ously               0.68
colonel             0.67
Activations Density: 0.038%