Explanations
phrases expressing uncertainty or lack of knowledge
Negative Logits
Token      Logit
ITT        -0.15
alez       -0.15
zk         -0.15
efon       -0.15
ķĮ         -0.15
♪          -0.14
ardless    -0.14
uzzi       -0.14
eln        -0.14
shaw       -0.14
Positive Logits
Token      Logit
about      0.48
about      0.39
关于       0.35
ABOUT      0.33
About      0.33
About      0.32
_about     0.31
tentang    0.30
.about     0.28
درباره     0.26
Activations Density: 0.053%