INDEX
Explanations
references to religious figures and concepts related to spirituality
Negative Logits
Token        Logit
ÚĨÛĮ         -0.16
rame         -0.16
rotch        -0.15
aru          -0.15
avig         -0.15
ál           -0.15
createState  -0.14
Bis          -0.14
agu          -0.14
Ziel         -0.14
Positive Logits
Token        Logit
cruc         0.22
repro        0.21
slew         0.20
insult       0.19
adj          0.19
reb          0.18
cast         0.17
convers      0.17
beh          0.17
dep          0.17
Activations Density: 0.379%