INDEX
    Explanations

    phrases and expressions that convey feelings and opinions about quality, improvement, or disappointment

    NEGATIVE LOGITS
    atik
    -0.18
    ouch
    -0.18
    nal
    -0.16
    ubar
    -0.16
    isko
    -0.15
     Bias
    -0.15
    bak
    -0.15
    ceed
    -0.14
    urance
    -0.14
    zych
    -0.14
    POSITIVE LOGITS
     better
    0.88
    better
    0.76
     Better
    0.75
    Better
    0.70
     BET
    0.66
     bet
    0.63
     mejor
    0.63
     лучше
    0.60
     besser
    0.58
     melhor
    0.58
    Act Density 0.251%

    No Known Activations