INDEX
Explanations
references to spiritual or religious concepts
NEGATIVE LOGITS
Gab       -0.17
Gab       -0.16
Herz      -0.15
forth     -0.15
Gabriel   -0.15
zeit      -0.14
ToFront   -0.14
enez      -0.14
iterr     -0.14
ront      -0.14
POSITIVE LOGITS
loh       0.17
振り      0.15
aret      0.15
Karn      0.14
hi        0.14
isch      0.14
Trident   0.14
매        0.14
tings     0.14
ziej      0.14
Activations Density 0.001%