INDEX
Explanations
mentions of faculty members or their roles within an academic context
Negative Logits
orph      -0.17
lege      -0.16
nie       -0.15
aldi      -0.15
vro       -0.15
nila      -0.14
bra       -0.14
bert      -0.14
bul       -0.14
ds        -0.14
Positive Logits
member    0.24
/st       0.23
members   0.22
ulty      0.21
/student  0.20
Member    0.19
_member   0.18
成员      0.18
-st       0.17
member    0.17
Activations Density: 0.007%