INDEX
Explanations
names of researchers and their affiliations
Negative Logits
856          -0.15
duc          -0.15
urge         -0.14
933          -0.13
dy           -0.13
ReSharper    -0.13
oyer         -0.13
atte         -0.13
Toolkit      -0.13
ฯ            -0.13
Positive Logits
Orc               0.24
Department        0.22
Department        0.20
corresponding     0.20
Departments       0.19
)↵↵↵↵↵↵↵↵         0.18
Correspond        0.17
correspondence    0.17
department        0.17
Division          0.16
Activations Density 0.051%