INDEX
Explanations
phrases related to different roles or positions in organizations
phrases that indicate roles or positions held by individuals in various organizations
Negative Logits
illin     -0.78
llah      -0.77
edu       -0.75
mort      -0.71
lass      -0.70
BR        -0.69
Reward    -0.69
root      -0.68
plate     -0.68
ouse      -0.68
Positive Logits
bidden    0.91
ked       0.90
geries    0.87
gotten    0.84
example   0.83
instance  0.82
gery      0.81
cers      0.81
aging     0.80
Misc      0.79
Activations Density: 0.109%