INDEX
    Negative Logits
    Token             Logit
    "*/"              -0.43
    "IconModule"      -0.42
    " CanadaChoose"   -0.41
    "manha"           -0.40
    "afy"             -0.40
    "scen"            -0.40
    "grenze"          -0.39
    " Dickerson"      -0.39
    "icode"           -0.39
    "isamment"        -0.37
    Positive Logits
    Token          Logit
    " refer"       0.75
    " referred"    0.73
    "refer"        0.68
    " referring"   0.65
    "referred"     0.64
    "Refer"        0.63
    " REFER"       0.62
    " Refer"       0.60
    " مرئيه"       0.57
    "referent"     0.56
    Act Density 0.020%

    No Known Activations