INDEX
    Explanations
    No Explanations Found
    Negative Logits
    "efficients"    -0.77
    "fman"          -0.73
    "elman"         -0.71
    "egu"           -0.70
    " Quincy"       -0.69
    "oki"           -0.69
    "ante"          -0.68
    "iren"          -0.67
    "ierrez"        -0.67
    "reau"          -0.66
    Positive Logits
    "regon"         0.69
    "CN"            0.65
    " padd"         0.63
    " strip"        0.63
    " merge"        0.62
    " Soccer"       0.59
    " sport"        0.59
    " lett"         0.59
    " fencing"      0.58
    " nort"         0.58
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.