INDEX
Explanations
alternative conditional phrases or options in statements
Negative Logits
Token      Logit
oria       -0.14
ame        -0.14
etus       -0.14
ushman     -0.13
등         -0.13
chas       -0.13
ιά         -0.13
atas       -0.13
/or        -0.13
ares       -0.12
Positive Logits
Token      Logit
rather     0.49
rather     0.38
Rather     0.38
Rather     0.34
plutôt     0.29
more       0.28
spíše      0.27
better     0.23
should     0.23
maybe      0.23
Activations Density 0.086%
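
The density figure can be read as the fraction of sampled tokens on which the feature fires. The sketch below is a minimal illustration under that assumption; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical, and the dashboard's exact definition is not stated in the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active (activation strictly greater than the threshold).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 1 of 4 tokens.
sample = [0.0, 0.0, 1.5, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.086% corresponds to the feature being active on roughly 86 of every 100,000 sampled tokens.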