INDEX
Explanations
instances where someone holds a specific position or duty
phrases that describe roles or positions held by individuals
Negative Logits
ongyang      -0.70
itational    -0.67
ople         -0.66
rax          -0.66
usra         -0.64
fab          -0.62
fy           -0.62
raq          -0.61
rir          -0.61
reau         -0.60
Positive Logits
pired        1.10
pires        0.94
regards      0.94
well         0.92
opposed      0.86
bestos       0.83
evidenced    0.80
piring       0.80
pire         0.73
stewards     0.72
Activations Density: 0.159%