INDEX
    Explanations
    No Explanations Found
    Negative Logits
    Token         Logit
    "ingen"       -0.75
    " tox"        -0.69
    "iii"         -0.67
    "char"        -0.65
    "otent"       -0.64
    "udic"        -0.64
    "vind"        -0.64
    "igne"        -0.62
    "ii"          -0.62
    "dam"         -0.61
    Positive Logits
    Token         Logit
    " range"      1.81
    " Range"      1.39
    " ranges"     1.18
    "range"       1.11
    " relative"   0.89
    " Walters"    0.81
    " lower"      0.78
    " Lower"      0.76
    "Range"       0.75
    "orsi"        0.74
    Activation Density: 0.000%

    No Known Activations

    This feature has no known activations.