INDEX
Explanations
mentions of software development and technical details
Negative Logits
Token           Logit
715             -0.16
-transitional   -0.15
bens            -0.14
idot            -0.14
abox            -0.14
yonel           -0.14
uger            -0.14
enment          -0.14
haul            -0.14
PILE            -0.14
Positive Logits
Token           Logit
dann            0.24
grave           0.22
grav            0.21
mortal          0.19
fer             0.19
ister           0.18
ro              0.18
fur             0.18
mort            0.18
violent         0.18
Activations Density: 0.003%