    Negative Logits
    -0.07     זמן
    -0.07     receptions
    -0.07     civilized
    -0.07    сложн
    -0.07
    -0.06     Körper
    -0.06     Farmer
    -0.06    Taken
    -0.06    markt
    -0.06    .Diff
    Positive Logits
    0.08
    0.07
    0.07    лив
    0.06    альных
    0.06     Jane
    0.06
    0.06
    0.06    /red
    0.06    ير
    0.06    /load
    Activation Density: 0.011%

    No Known Activations