INDEX
    Explanations
    Negative Logits
    " turn"        -1.58
    " turning"     -1.32
    "Turn"         -1.27
    " Turn"        -1.20
    "turn"         -1.17
    " TURN"        -1.13
    " turned"      -1.13
    " turns"       -1.05
    " turno"       -1.05
    "TURN"         -1.02
    Positive Logits
    " resorted"    1.15
    " recourse"    1.13
    " resort"      0.94
    " again"       0.93
    " снова"       0.93
    " recurrir"    0.88
    "again"        0.83
    "resort"       0.81
    " vaak"        0.80
    " to"          0.79
    Act Density 0.019%

    No Known Activations