INDEX
Explanations
phrases indicating potential actions or possibilities
it is possible to + verb
Negative Logits
offense      -0.38
fellow       -0.38
its          -0.36
NEIGH        -0.35
it           -0.35
йо           -0.34
signOut      -0.34
promotion    -0.34
SOUT         -0.33
durer        -0.33
Positive Logits
ניתן             0.73
canst            0.69
można            0.68
ניתן             0.66
müm              0.65
можно            0.63
אפשר             0.62
สามารถ           0.62
Vidite           0.61
MethodManager    0.61
Activations Density 0.019%
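
As a rough illustration of what a density figure like the one above might measure, here is a minimal sketch. It assumes that density means the share of sampled tokens on which the feature has a nonzero activation; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens on which the feature
    fires at all. The dashboard does not confirm this definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 100,000 tokens, 19 of which activate the feature.
sample = [0.0] * 99_981 + [1.0] * 19
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.019% would correspond to the feature firing on roughly 19 out of every 100,000 tokens.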