INDEX
NEGATIVE LOGITS
Token            Logit
characteristics  -0.08
invit            -0.08
farben           -0.08
Kry              -0.07
Erlebnis         -0.07
깔               -0.07
-indent          -0.07
provoking        -0.07
caos             -0.07
Characteristics  -0.07
POSITIVE LOGITS
Token          Logit
限制           0.13
ограничения    0.12
restricciones  0.11
beperk         0.10
restrictions   0.10
imposed        0.10
censorship     0.10
Restrictions   0.10
Restr          0.10
Restrictions   0.10
Activations Density 0.008%