Explanations
phrases related to uncertainty or ambiguity
Negative Logits
Token      Logit
probably   -0.17
Leban      -0.16
iple       -0.15
probably   -0.15
either     -0.15
دار        -0.15
iman       -0.15
ocale      -0.15
Probably   -0.14
might      -0.14
Positive Logits
Token      Logit
anything   0.32
ever       0.29
anything   0.28
Anything   0.27
EVER       0.25
Anything   0.25
indeed     0.24
anywhere   0.24
weren      0.23
ANY        0.22
Activations Density: 0.067%