INDEX
Explanations
references to religious beliefs and themes
Negative Logits
Token        Logit
359          -0.16
lesia        -0.15
ileo         -0.15
652          -0.15
355          -0.14
ceremonial   -0.14
Looper       -0.14
deen         -0.14
bachelor     -0.14
845          -0.14
Positive Logits
Token        Logit
appar        0.34
Fat          0.30
Fat          0.26
FAT          0.24
.Fat         0.23
fat          0.22
fat          0.22
-fat         0.22
Our          0.19
devotion     0.19
Activations Density 0.062%