Explanations
questions and discussions relating to personal academic experiences and processes
Negative Logits
_PB         -0.16
erville     -0.14
PD          -0.14
BLL         -0.14
apter       -0.14
ondon       -0.14
values      -0.14
undef       -0.14
facts       -0.14
ogen        -0.13
Positive Logits
University  0.36
University  0.34
UNIVERSITY  0.32
جامعة       0.28
Loy         0.28
Universidad 0.28
Univ        0.28
UT          0.27
Duke        0.26
UV          0.26
Activations Density 0.804%