    Explanations
    No Explanations Found
    Negative Logits
    "nir"        -0.84
    "ANK"        -0.82
    "esta"       -0.79
    "heddar"     -0.77
    "flies"      -0.73
    "bush"       -0.72
    "Greek"      -0.71
    " Yamato"    -0.69
    "rain"       -0.69
    "¿"          -0.68
    Positive Logits
    "eatures"           0.86
    " boycott"          0.84
    "eport"             0.71
    " refrain"          0.68
    " discourse"        0.65
    " recreation"       0.65
    " abstinence"       0.64
    "ruciating"         0.63
    " clinch"           0.63
    " impossibility"    0.62
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.