Explanations
references to various universities and geographic locations
Negative Logits
irt     -0.16
ата     -0.15
atha    -0.15
zza     -0.15
assa    -0.14
uy      -0.14
略      -0.14
akh     -0.14
еса     -0.14
aux     -0.13
Positive Logits
Southern    0.19
Notre       0.18
Evans       0.18
Phoenix     0.17
Northern    0.17
redient     0.17
Bridge      0.16
phoenix     0.16
Santo       0.16
Pike        0.16
Activations Density 0.012%