INDEX
Explanations
specific occupations or roles that people may have
specific roles or identities associated with individuals
Negative Logits
izens            -0.72
architectures    -0.71
intervals        -0.70
ories            -0.69
Lans             -0.68
iPads            -0.67
Boards           -0.67
offenses         -0.66
rams             -0.66
shocks           -0.65
Positive Logits
digy             0.92
myself           0.91
unto             0.84
alyst            0.84
iste             0.81
herself          0.81
ess              0.79
someday          0.78
himself          0.77
nik              0.75
Activations Density: 0.266%