INDEX
Explanations
references to Christianity and religious themes
Negative Logits
secular            -0.19
ephir              -0.17
rait               -0.15
aktu               -0.15
validationResult   -0.15
ibr                -0.15
urse               -0.15
atheist            -0.14
ÙģÙĨ               -0.14
оÑı                -0.14
Positive Logits
gloss              0.15
sup                0.15
Christ             0.14
ard                0.14
enza               0.14
locker             0.14
gray               0.14
Xt                 0.13
Nic                0.13
Deliver            0.13
Activations Density 0.881%