Explanations
university-related events and references
references to academic institutions and specific political issues or controversies
Negative Logits
Token      Logit
abase      -0.65
ij士       -0.62
ificant    -0.57
seiz       -0.56
theless    -0.56
forth      -0.56
divided    -0.55
Wem        -0.54
GROUP      -0.53
izontal    -0.53
Positive Logits
Token      Logit
.</        0.81
!".        0.71
'.         0.71
".         0.71
?".        0.70
.[         0.70
.?         0.70
etc        0.68
$.         0.68
*.         0.67
Activations Density: 0.736%