INDEX
Explanations
mentions of universities, especially Oxford and Princeton
mentions of prestigious universities, particularly Oxford and Princeton
Negative Logits
quo           -0.77
HUD           -0.71
Magikarp      -0.71
ACTION        -0.68
rill          -0.68
venge         -0.65
rollers       -0.65
RANT          -0.64
rette         -0.63
lein          -0.63
Positive Logits
shire          1.53
University     1.03
Analy          0.93
Circus         0.89
Lect           0.88
Universities   0.87
comma          0.87
Teaching       0.85
laureate       0.84
Scholars       0.83
Activations Density 0.042%