    Explanations
    No Explanations Found
    Negative Logits
    "cape"        -0.67
    "Knight"      -0.67
    "mingham"     -0.66
    " WATCHED"    -0.66
    " Meal"       -0.65
    "Chart"       -0.63
    "rise"        -0.63
    "Tem"         -0.62
    "iami"        -0.62
    " Overt"      -0.60
    Positive Logits
    "rons"           0.73
    "unda"           0.69
    " retarded"      0.67
    " Vu"            0.66
    " greenhouse"    0.66
    " buggy"         0.65
    " Phi"           0.64
    "oppy"           0.63
    " reins"         0.61
    "ELD"            0.61
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.