INDEX
    Explanations

    phrases that express commendation or well-wishing

    Negative Logits
    ViewFeatures    -1.01
    weile           -0.77
    Linki           -0.73
    companied       -0.71
    ‍♀️              -0.70
     Cummings       -0.69
    ]="             -0.69
    Geografie       -0.68
    defn            -0.67
    ęku             -0.67
    Positive Logits
     BEST           1.78
    BEST            1.74
    best            1.68
    Best            1.64
     best           1.62
     Best           1.59
    melhor          1.31
     Mejor          1.16
     beſt           1.15
     terbaik        1.09
    Act Density 0.069%

    No Known Activations