INDEX
    Explanations

    conjugation

    Negative Logits
     continents       -0.09
    Inspir            -0.08
    Lobby             -0.08
     Pent             -0.08
    elsey             -0.08
    еце               -0.08
    representation    -0.08
     mej              -0.08
    Representation    -0.08
    repr              -0.08
    Positive Logits
     élég             0.08
                      0.07
     شر               0.07
    ploader           0.07
     teamed           0.07
    /end              0.07
    れる                0.07
    ashi              0.07
     boyfriend        0.07
    луж               0.07
    Act Density 0.010%

    No Known Activations