Explanations
Proper nouns, specifically names of people
occurrences of the word "aut."
Negative Logits
mary        -0.73
borough     -0.70
BLE         -0.69
nce         -0.68
WAY         -0.66
earchers    -0.66
MIT         -0.66
function    -0.65
mers        -0.65
Beir        -0.63
Positive Logits
ilus        1.06
emort       1.02
iful        0.94
umn         0.91
ographed    0.90
ograph      0.90
opsy        0.90
ica         0.89
ographs     0.88
ilia        0.86
Activations Density 0.020%