INDEX
    Explanations

    phrases indicating the effectiveness or superiority of actions and decisions

    Negative Logits
    "ieri"       -0.17
    "maal"       -0.15
    "akis"       -0.15
    " ÐŁÐ¾ÑĤ"    -0.14
    "äºķ"        -0.14
    " Pot"       -0.14
    "ç¨ĭ度"      -0.14
    "MAND"       -0.14
    "552"        -0.14
    "زÙĦ"        -0.14
    Positive Logits
    " course"    0.35
    " bet"       0.34
    " way"       0.32
    " choice"    0.32
    " option"    0.29
    " thing"     0.28
    "course"     0.27
    " bets"      0.27
    " Course"    0.27
    " move"      0.27
    Activation Density: 0.096%

    No Known Activations