INDEX
Explanations
information related to professions or job titles
phrases indicating job titles and roles
Negative Logits
encies    -0.84
alties    -0.82
events    -0.79
tons      -0.79
mares     -0.77
uden      -0.71
ulence    -0.69
ayers     -0.69
agree     -0.69
ency      -0.68
Positive Logits
member        1.03
teenager      1.02
prisoner      0.95
liaison       0.93
waitress      0.92
surrogate     0.91
trustee       0.91
teacher       0.90
substitute    0.90
guest         0.90
Activations Density: 0.116%