Explanations
mentions of the American Association of University Professors (AAUP)
mentions of academic ratings or classifications
Negative Logits
anooga    -0.81
Jenner    -0.75
orius     -0.72
ophers    -0.71
lov       -0.71
gro       -0.71
fully     -0.71
hunt      -0.70
ified     -0.69
nets      -0.68
Positive Logits
BILITIES  0.96
ccess     0.88
BILITY    0.88
zona      0.88
HHHH      0.82
BIL       0.81
qua       0.81
ppo       0.78
HAHAHAHA  0.77
ño        0.77
Activations Density: 0.039%