Explanations
affirmations and expressions of personal opinion
Negative Logits
Token       Logit
Sail        -0.17
estre       -0.16
ullo        -0.16
ÑĴ          -0.15
]={↵        -0.15
´Ŀ          -0.14
ä¸įå¾Ĺ      -0.14
ราย         -0.14
æĺŃ         -0.14
whereas     -0.14
Positive Logits
Token       Logit
yes         0.33
yes         0.32
Yes         0.27
Yes         0.27
YES         0.26
indeed      0.26
YES         0.23
=yes        0.21
Yep         0.19
.Yes        0.19
Activations Density: 0.100%