Explanations
comparisons between original and imitation items or works
Negative Logits
strup        -0.15
alo          -0.14
achen        -0.13
лоп          -0.13
thinkable    -0.13
miesz        -0.13
oux          -0.13
rve          -0.13
atra         -0.13
ople         -0.12
Positive Logits
original     1.30
original     1.11
originals    1.02
Original     1.02
ORIGINAL     0.98
-original    0.96
Original     0.95
originally   0.94
.original    0.88
原           0.88
Activations Density 0.458%
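
As a rough illustration of what a density figure like the one above measures, the sketch below assumes that density is the fraction of token positions on which the feature's activation is non-zero. That definition, the function name `activation_density`, and the toy activation values are all assumptions made for illustration; none of them come from the dashboard itself.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 5 of which activate the feature.
toy = [0.0] * 995 + [0.7, 1.2, 0.3, 2.1, 0.9]
print(f"{activation_density(toy):.3%}")  # prints 0.500%
```

Under this assumed definition, a density of 0.458% would correspond to the feature firing on roughly 4 to 5 of every 1000 tokens.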