Explanations
conditional statements and modal verbs indicating potential outcomes
NEGATIVE LOGITS
Token             Logit
kaarangay         -0.94
تقاوى             -0.92
DeleteBehavior    -0.90
Jefus             -0.88
poffible          -0.88
Geplaatst         -0.87
myſelf            -0.86
leſs              -0.84
дописавши         -0.84
Personendaten     -0.82
POSITIVE LOGITS
Token             Logit
be                0.94
have              0.80
also              0.76
(blank)           0.73
being             0.66
(blank)           0.65
not               0.65
make              0.64
a                 0.62
.                 0.60
Activations Density: 0.269%