Explanations
references to trauma and related terms
Negative Logits
Token       Logit
yu          -0.17
deen        -0.15
tring       -0.15
yper        -0.15
рут         -0.15
criptive    -0.14
zar         -0.14
Dud         -0.14
rop         -0.14
isp         -0.14
Positive Logits
Token       Logit
uma         0.30
umatic      0.26
jectory     0.25
umas        0.25
jectories   0.24
ipse        0.24
umat        0.23
iteur       0.22
itors       0.21
pez         0.21
Activations Density 0.007%