INDEX
Explanations
references to sensitive societal issues, particularly relating to consent and interpersonal relationships
Negative Logits
Token      Logit
uide       -0.16
amma       -0.15
ius        -0.15
aleb       -0.14
sch        -0.14
your       -0.14
jour       -0.14
akk        -0.14
touch      -0.14
inters     -0.14
Positive Logits
Token      Logit
sometimes  0.27
sometimes  0.26
usually    0.25
variably   0.24
usually    0.24
ometimes   0.24
often      0.22
often      0.21
either     0.20
либо       0.20
Activations Density: 0.308%
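
As a rough illustration of what an activation density figure usually means, the sketch below computes the fraction of token positions on which a feature fires. It assumes density is defined as the share of positions with activation above zero; the source does not state the exact definition. The function name, the threshold parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature is
    active (activation > threshold). The dashboard may define it differently.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 3 of which activate the feature.
sample = [0.0] * 997 + [0.4, 1.2, 0.05]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.300%
```

Under this reading, a density of 0.308% would mean the feature is active on roughly 3 of every 1000 token positions sampled.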