INDEX
    Explanations

    twisted things and actions

    Negative Logits
    Token           Logit
    (not shown)     0.57
    " is"           0.57
    "йы"            0.54
    "0"             0.53
    "ell"           0.50
    "is"            0.48
    "ي"             0.48
    " ropa"         0.47
    "ět"            0.46
    "5"             0.46
    Positive Logits
    Token           Logit
    " twisted"      1.14
    " twisting"     1.11
    (not shown)     1.07
    " twist"        1.02
    " twists"       0.96
    " distortion"   0.94
    "twisted"       0.94
    " Twisted"      0.91
    " distorted"    0.91
    " distort"      0.87
    Act Density 0.035%

    No Known Activations