INDEX
    Explanations
    No Explanations Found
    Negative Logits
    " liner"        -0.76
    "pecially"      -0.74
    "0010"          -0.71
    " contamin"     -0.65
    "retty"         -0.65
    " certific"     -0.65
    " monop"        -0.63
    "ishly"         -0.63
    "orously"       -0.62
    " param"        -0.60
    Positive Logits
    " Fathers"       0.74
    "ador"           0.73
    "utral"          0.73
    "tics"           0.72
    "ãĥ¼ãĥ³"         0.72
    " Deliver"       0.67
    " Workers"       0.67
    "xon"            0.66
    "zek"            0.66
    " Employees"     0.66
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.