Explanations
key historical figures and their contributions to science
Negative Logits
Token    Logit
orget    -0.14
Pazar    -0.14
viso     -0.14
ëŀĮ      -0.14
Fior     -0.14
ÄĻż      -0.14
-Za      -0.14
ondon    -0.14
ngen     -0.13
IO       -0.13
Positive Logits
Token    Logit
method   0.17
OLON     0.16
ór       0.15
ury      0.14
jiang    0.14
jaw      0.14
erton    0.14
uji      0.14
Mansion  0.14
Manor    0.13
Activations Density: 0.082%