INDEX
Explanations
references to educational institutions, specifically universities and colleges
Negative Logits
ifer      -0.16
ido       -0.15
avs       -0.15
adel      -0.14
Marg      -0.14
orrow     -0.14
abor      -0.14
eno       -0.14
color     -0.14
ima       -0.14
Positive Logits
OfSize    0.16
sko       0.15
зÑĭ       0.14
Ø·Ùĩ      0.14
quote     0.14
assel     0.14
okit      0.14
oplevel   0.14
rella     0.14
opup      0.14
Activations Density 0.052%