INDEX
Explanations
professors and their affiliated universities
phrases that indicate academic positions and their affiliations
Negative Logits
artifacts   -0.75
respir      -0.73
éĹĺ         -0.73
coffin      -0.71
proxy       -0.71
inval       -0.70
activ       -0.69
horizont    -0.67
ãĤª         -0.66
upside      -0.66
Positive Logits
Georgetown    1.44
NYU           1.43
Yale          1.43
Rutgers       1.42
Princeton     1.39
Harvard       1.39
Northwestern  1.38
Syracuse      1.34
Cornell       1.32
Dartmouth     1.31
Activations Density: 0.113%