INDEX
Explanations
professor titles and affiliated universities
affiliations to academic institutions
Negative Logits
200000              -0.71
wagon               -0.68
0000000000000000    -0.67
ãĤª                 -0.67
kneeling            -0.67
âĶ                  -0.66
upside              -0.66
xual                -0.65
proxies             -0.65
Accessory           -0.65
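Two of the negative-logit tokens (`ãĤª` and `âĶ`) look like encoding debris but are more likely raw vocabulary strings. The sketch below assumes the tokens come from a GPT-2 style byte-level BPE vocabulary, which the dashboard itself does not state; under that assumption each displayed character stands for one byte. The helper names `token_bytes` and `decode_token` are hypothetical, chosen for illustration only.

```python
def bytes_to_unicode():
    """Byte -> printable character table used by GPT-2 style byte-level BPE.

    Assumption: the dashboard's tokenizer follows this scheme.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are remapped to code points from 256 upward.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_bytes(token):
    """Return the raw bytes that a displayed vocabulary string stands for."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


def decode_token(token):
    """Decode a displayed token to text; incomplete UTF-8 becomes U+FFFD."""
    return token_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤª", "âĶ", "wagon"]:
        print(repr(tok), token_bytes(tok), repr(decode_token(tok)))
```

If the assumption holds, `ãĤª` is the three-byte UTF-8 encoding of the katakana character オ, and `âĶ` is only the first two bytes of a three-byte character, so it has no standalone rendering.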
Positive Logits
NYU           1.55
Harvard       1.52
Yale          1.47
Cornell       1.47
University    1.46
Rutgers       1.46
Stanford      1.43
Princeton     1.43
Carnegie      1.43
UCLA          1.41
Activations Density: 0.112%