    Explanations
    No Explanations Found
    Negative Logits
    Token         Logit
    "oshenko"     -0.78
    "tones"       -0.78
    "cius"        -0.76
    "abilia"      -0.73
    "cdn"         -0.71
    " Ukraine"    -0.71
    "nos"         -0.70
    "vt"          -0.68
    "forum"       -0.67
    "auer"        -0.65
    Positive Logits
    Token         Logit
    " destro"     0.68
    "ppa"         0.67
    " bulldo"     0.62
    " Luffy"      0.61
    " Franch"     0.60
    "gers"        0.60
    " scrub"      0.60
    "aylor"       0.59
    "ppo"         0.59
    " Viol"       0.56
    Activation Density: 0.000%

    No Known Activations

    This feature has no known activations.