INDEX
Explanations
words that are misspelled or have typographical errors
instances of the word "err."
Negative Logits
creen      -0.93
esville    -0.81
eph        -0.76
±          -0.71
ciating    -0.70
ķ          -0.67
vention    -0.64
Ŀ          -0.61
figure     -0.61
ļé         -0.61
Positive Logits
untled     1.04
antly      0.97
ands       0.96
rr         0.96
idge       0.95
utherford  0.93
abbit      0.93
uti        0.90
andom      0.90
ange       0.88
Activations Density: 0.041%