Explanations
indicators of audience engagement and self-identification
Negative Logits
uez           -0.17
structural    -0.16
Structural    -0.16
igsaw         -0.15
ensch         -0.15
addy          -0.15
Ñī            -0.15
ertz          -0.15
rov           -0.14
uala          -0.14
Positive Logits
fur           0.18
jste          0.16
etty          0.16
yourself      0.15
fur           0.15
.echo         0.14
hopefully     0.14
yourselves    0.14
piel          0.14
ãģ¾ãģł        0.14
Activations Density: 0.124%