    Explanations

    actions or activities described in various contexts

    Negative Logits
    Token          Logit
    "urette"       -0.16
    " Ñģпок"       -0.15
    "áš"           -0.15
    "ÑĮÑı"         -0.15
    "paring"       -0.15
    "\tUP"         -0.15
    "avra"         -0.14
    "лÑİб"         -0.14
    "rase"         -0.14
    "uropean"      -0.14

    Positive Logits
    Token          Logit
    " away"        0.29
    " everything"  0.28
    " wonders"     0.27
    " violence"    0.24
    " unto"        0.24
    "le"           0.24
    " whatever"    0.23
    " battle"      0.23
    " justice"     0.23
    " right"       0.22
    Activation Density: 0.067%

    No Known Activations