INDEX
Explanations
references to the University of Oxford and related terms
NEGATIVE LOGITS
odd         -0.16
Äįet        -0.15
ÎŃÏģγ       -0.15
NewLabel    -0.14
ighted      -0.14
uru         -0.14
Danh        -0.14
anke        -0.14
ypse        -0.14
omor        -0.14
POSITIVE LOGITS
shire       0.33
University  0.25
Companion   0.19
Bro         0.19
university  0.19
University  0.18
Isis        0.17
UNIVERSITY  0.16
comma       0.16
Uni         0.16
Activations Density 0.008%