Explanations
mentions of students
references to students
Negative Logits
rous        -0.72
neum        -0.68
UTERS       -0.68
Cape        -0.63
ality       -0.62
osp         -0.59
nect        -0.59
SHIP        -0.59
bilateral   -0.59
orse        -0.58
Positive Logits
hip         0.91
uates       0.85
girls       0.81
enrolled    0.80
boys        0.77
tuition     0.77
bridge      0.76
hips        0.76
haw         0.74
tu          0.74
Activations Density: 0.033%