Explanations
references to corrections and amendments in texts
Negative Logits
uki         -0.16
mars        -0.15
fres        -0.15
occasion    -0.15
hek         -0.14
ensa        -0.14
grav        -0.14
progress    -0.14
eger        -0.14
ropa        -0.14
Positive Logits
lew         0.17
cono        0.16
prus        0.15
ìŀ¬         0.15
grass       0.15
pub         0.15
pubs        0.15
gv          0.14
ìĬ          0.14
pub         0.14
Activations Density 0.010%