INDEX
Explanations
phrases indicating possession or ownership
Negative Logits
frog      -0.15
istr      -0.15
onso      -0.14
adas      -0.14
ogn       -0.14
ÏĨι       -0.14
ocking    -0.13
angen     -0.13
chin      -0.13
аÑĢÑĩ      -0.13
Positive Logits
questions  0.29
any        0.23
questions  0.20
Questions  0.20
trouble    0.19
ever       0.18
question   0.18
spare      0.18
uest       0.18
aeper      0.18
Activations Density: 0.078%