INDEX
Explanations
academic fields and programs
references to various academic disciplines or fields of study
Negative Logits
defective   -0.63
Lann        -0.62
cakes       -0.61
unal        -0.59
IER         -0.59
amino       -0.59
phabet      -0.57
emonic      -0.57
entity      -0.57
dreaded     -0.56
Positive Logits
uggest      0.94
udo         0.93
krit        0.93
chool       0.92
manship     0.89
ilk         0.88
abroad      0.88
hig         0.86
scholar     0.83
hell        0.83
Activations Density: 0.022%