INDEX
Explanations
conditional phrases indicating potential actions or outcomes
Negative Logits
raya      -0.18
odds      -0.15
åĿĽ       -0.15
ahun      -0.15
ãģĭãĤĭ    -0.15
inkle     -0.15
493       -0.15
alte      -0.14
ajar      -0.14
çĦ¼       -0.14
Positive Logits
agine     0.16
ì§ij      0.15
Mood      0.15
958       0.15
ocate     0.14
Santos    0.14
feed      0.14
esign     0.14
èĥİ       0.14
rend      0.14
Activations Density: 0.237%