INDEX
Explanations
references to academic roles and affiliations
Negative Logits
èĵ        -0.18
anza      -0.15
á»§a      -0.15
ukkan     -0.15
hai       -0.15
èŀº       -0.14
ÙĪÙħÛĮ    -0.14
heit      -0.14
à¸Ńà¸Ķ    -0.14
emean     -0.14
Positive Logits
Cambridge  0.29
Gon        0.26
Tutorial   0.24
Pemb       0.24
Trinity    0.24
Oxford     0.22
tutorial   0.22
Bras       0.22
Cam        0.21
Jesus      0.21
Activations Density 0.127%