INDEX
Explanations
references to religious and spiritual concepts
Negative Logits
inters    -0.15
atan      -0.14
pez       -0.14
реш       -0.14
lander    -0.13
none      -0.13
fty       -0.13
inez      -0.13
ertz      -0.12
olkien    -0.12
Positive Logits
term      0.54
word      0.39
Term      0.36
Term      0.36
term      0.36
TERM      0.34
_term     0.32
-term     0.31
phrase    0.31
TERM      0.30
Activations Density 0.497%
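
As a rough illustration of the density figure, the sketch below assumes that "Activations Density" means the fraction of token positions on which the feature fires (activation above zero). That definition, the function name `activation_density`, the `threshold` parameter, and the toy data are all assumptions made for illustration; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a feature counts as "active" on a token when its
    activation is strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (hypothetical): 1000 token positions, 5 of them active.
acts = [0.0] * 995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, a density of 0.497% would mean the feature is active on roughly 5 of every 1000 sampled tokens.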