Explanations
mentions of positions, roles, and titles
terms related to organizational structures and roles
Negative Logits
Token     Logit
ãĥĥãĥī    -0.75
Bir       -0.63
zan       -0.63
MAT       -0.62
geries    -0.58
mbuds     -0.57
GO        -0.57
à¼        -0.54
worldly   -0.54
ãĥĥ       -0.53
Positive Logits
Token     Logit
of        1.32
of        0.99
Of        0.86
thereof   0.82
OF        0.80
eatures   0.77
Of        0.73
OF        0.72
oft       0.70
for       0.70
Activations Density 0.854%
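
As a rough illustration of what an activation density figure like the one above measures, the sketch below computes the share of token positions on which a feature's activation exceeds a threshold. It assumes that "density" means the fraction of tokens on which the feature fires; the function name `activation_density`, the zero threshold, and the sample values are hypothetical and are not taken from the page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if len(activations) == 0:
        raise ValueError("activations must be non-empty")
    # Count the positions where the feature is active.
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 2 of 8 token positions.
sample = [0.0, 0.0, 1.7, 0.0, 0.0, 0.4, 0.0, 0.0]
print(f"{activation_density(sample):.3%}")  # 2 / 8 -> 25.000%
```

Under this reading, a density of 0.854% would correspond to the feature firing on roughly 1 in every 117 tokens.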