INDEX
    Explanations
    Negative Logits
     Croat      -0.07
     IDs        -0.07
    riend       -0.07
    iani        -0.06
     foam       -0.06
     boyc       -0.06
     Jam        -0.06
     Diana      -0.06
     Candy      -0.06
     Zucker     -0.06
    Positive Logits
     firearms    0.07
     Gun         0.07
    ültür        0.07
    ình          0.07
    instance     0.06
    Individual   0.06
    ikler        0.06
                 0.06
     firearm     0.06
     उसक         0.06
    Act Density 0.006%

    No Known Activations