INDEX
    Explanations

    helpful and useful feedback

    NEGATIVE LOGITS
    " usable"         0.91
    " meaningful"     0.86
    " useful"         0.80
    "useful"          0.76
    " meaningfully"   0.75
    " usefulness"     0.75
    "Useful"          0.73
    " مفید"           0.73
    " bermanfaat"     0.73
    "有用"            0.72
    POSITIVE LOGITS
    " extremely"      0.67
    " incredibly"     0.64
    " immensely"      0.60
    " Extremely"      0.59
    " hel"            0.59
    " insanely"       0.56
    " estremamente"   0.56
    "extremely"       0.55
    " unbelievably"   0.55
    " cực"            0.51
    Activation Density: 0.022%

    No Known Activations