INDEX
    Explanations
    No Explanations Found
    Negative Logits
     implanted    -0.69
     unnamed      -0.68
     unused       -0.64
     fictitious   -0.64
     tampering    -0.62
     homegrown    -0.62
     oldest       -0.62
     estimated    -0.61
     imported     -0.61
     counterfeit  -0.60
    Positive Logits
     hers    0.99
    ttes     0.88
    zos      0.86
    iquette  0.83
    uci      0.76
    phas     0.75
    nings    0.74
    vre      0.73
    rang     0.72
    anship   0.72
    Activation Density: 0.000%

    No Known Activations

    This feature has no known activations.