INDEX
Explanations
references to religious terms and phrases associated with peace and blessings
Negative Logits
Token    Logit
bose     -0.16
ux       -0.15
¼        -0.14
ppard    -0.14
elay     -0.14
iger     -0.14
olini    -0.14
ιβ       -0.14
dee      -0.14
azon     -0.14
Positive Logits
Token        Logit
HL           0.15
ÑĢо          0.14
avers        0.14
aver         0.14
ERA          0.14
Denn         0.13
WithValue    0.13
aza          0.13
.nasa        0.13
lòng         0.13
Activations Density: 0.012%