INDEX
Explanations
references to religious symbols and deities
Negative Logits
Token    Logit
Mob      -0.15
ç±       -0.15
orian    -0.15
elden    -0.15
quito    -0.14
ÙģÙĤ     -0.14
ilde     -0.14
pike     -0.14
жÑĥ     -0.14
kent     -0.14
Positive Logits
Token    Logit
sway     0.21
Lord     0.20
Bra      0.20
Ling     0.20
ling     0.19
Sw       0.19
Dev      0.19
Tray     0.18
Lord     0.18
Trim     0.18
Activations Density 0.253%