INDEX
Explanations
references to academic policies and expectations for students
NEGATIVE LOGITS
Token         Logit
arel          -0.18
utterstock    -0.16
opus          -0.15
kino          -0.15
obra          -0.14
arin          -0.14
yre           -0.14
incentiv      -0.14
moth          -0.14
nab           -0.14
POSITIVE LOGITS
Token         Logit
SUN           0.19
academic      0.19
shall         0.18
student       0.18
acad          0.18
Academic      0.17
sanction      0.17
(cf           0.17
classroom     0.17
policy        0.16
Activations Density: 0.010%