INDEX
Explanations
references to roles or positions within organizations
Negative Logits
erson     -0.15
uba       -0.14
hom       -0.14
kur       -0.13
cratch    -0.13
resher    -0.13
CFG       -0.13
lope      -0.13
short     -0.13
ička      -0.13
Positive Logits
jam       0.15
itsu      0.15
DISP      0.15
ニック    0.15
-widgets  0.15
URA       0.14
فار       0.14
ucken     0.14
amba      0.14
αρα       0.14
Activations Density 0.147%