INDEX
    Explanations
    Negative Logits
     OnInit     -0.07
     Hancock    -0.07
     thật       -0.07
     borr       -0.07
    olumbia     -0.06
    الث         -0.06
    THON        -0.06
    'ят         -0.06
    estr        -0.06
    SHORT       -0.06
    Positive Logits
     cage        0.21
     Cage        0.17
     cages       0.15
    age          0.09
    AGE          0.09
     crates      0.08
     tame        0.07
                 0.07
    ages         0.07
     unp         0.07
    Activation Density: 0.002%

    No Known Activations