Explanations
conditional phrases and questions regarding willingness or possibility
Negative Logits
ئت        -0.14
eec       -0.14
neh       -0.14
ÏĦον      -0.14
lify      -0.14
ванов     -0.14
trusted   -0.13
ãģ¨ãģĨ    -0.13
arkin     -0.13
istani    -0.13
Positive Logits
anyone    0.37
anybody   0.36
there     0.30
Anyone    0.29
anything  0.28
any       0.26
n         0.25
Anyone    0.24
any       0.24
Any       0.22
Activations Density 0.181%
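
To make the density figure concrete, here is a minimal sketch. It assumes that "activation density" means the fraction of sampled token positions on which the feature fires (activation above zero); the page does not state this definition. The function name `activation_density`, the threshold, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is (active positions) / (all sampled positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 181 of 100,000 token positions.
sample = [1.0] * 181 + [0.0] * 99_819
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.181% would correspond to the feature firing on roughly 2 of every 1,000 tokens.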