INDEX
Explanations
instances of the word "school"
references to scholarly publications or academic achievements
Negative Logits
GOODMAN        -0.71
whales         -0.68
underwater     -0.63
disapp         -0.62
sweats         -0.59
Mub            -0.59
inctions       -0.58
Dying          -0.58
trainers       -0.56
nonexistent    -0.56
Positive Logits
uble            0.94
interstitial    0.85
enberg          0.82
kov             0.80
enegger         0.79
aldi            0.79
itzer           0.79
feld            0.77
acher           0.77
schild          0.77
Activations Density: 0.068%