INDEX
    Explanations
    Negative Logits
     tako          -0.08
     investors     -0.07
     waar          -0.07
    ورية          -0.07
     доп           -0.06
     fier          -0.06
     enfants       -0.06
     emploi        -0.06
     fab           -0.06
     маль          -0.06
    Positive Logits
    	Code         0.07
    IFICATE       0.06
    прав          0.06
    uyển          0.06
    init          0.06
    gricult       0.06
     innate       0.06
    Fixed         0.06
    wk            0.06
     Gamma        0.06
    Activation Density: 0.002%

    No Known Activations