Explanations
references to behavioral guidelines and moral teachings
Negative Logits
peria    -0.18
apur     -0.17
elix     -0.15
swick    -0.15
ç¤       -0.14
ÑĢай     -0.14
_peer    -0.14
maur     -0.14
ritz     -0.14
æ¿       -0.14
Positive Logits
Acts     0.18
Acts     0.18
Revel    0.18
Gal      0.18
clin     0.17
chapter  0.16
985      0.16
Phil     0.16
Romans   0.16
kj       0.15
Activations Density 0.085%