Explanations
mentions of Harvard University
references to Harvard University
Negative Logits
Token          Logit
alez           -0.76
ichick         -0.74
oute           -0.73
eur            -0.71
wagon          -0.71
phabet         -0.70
DonaldTrump    -0.69
repeat         -0.67
afort          -0.67
atron          -0.66
Positive Logits
Token          Logit
University     1.20
uates          1.00
Yard           0.97
Crimson        0.93
College        0.89
Graduate       0.88
Law            0.88
University     0.87
Lect           0.86
Medical        0.85
Activations Density 0.029%