    Negative Logits
    Token          Logit
    " temple"      -0.07
    " Temple"      -0.07
    " Hope"        -0.06
    " Stem"        -0.06
    " DAY"         -0.06
    "uro"          -0.06
    " monument"    -0.06
    " Day"         -0.06
    "-win"         -0.06
    " Restr"       -0.06
    Positive Logits
    Token          Logit
    " ByVal"       0.07
    "batis"        0.07
    " handguns"    0.07
    "ってきた"      0.06
    "addle"        0.06
    "received"     0.06
    " obě"         0.06
    " многих"      0.06
    " 나를"         0.06
    "adle"         0.06
    Activation Density: 0.029%

    No Known Activations