INDEX
Explanations
geographic locations and institutional names
Negative Logits
Token     Logit
denomin   -0.18
YC        -0.17
denom     -0.16
OBS       -0.15
idon      -0.15
Tobias    -0.15
assin     -0.14
kit       -0.14
Bison     -0.14
.AI       -0.14
Positive Logits
Token     Logit
Weber     0.23
aday      0.18
iggs      0.18
801       0.17
Hatch     0.17
Hunts     0.17
Uint      0.17
poly      0.16
Utah      0.16
angu      0.16
Activations Density: 0.034%