INDEX
    Explanations

    words that convey strong positive emotions or highlight exceptional qualities

    Negative Logits
    agra      -1.65
    ainer     -1.64
    ves       -1.57
    ters      -1.57
    heses     -1.56
    tico      -1.55
    tern      -1.53
    Åij       -1.53
    uelle     -1.53
    ár        -1.53
    Positive Logits
    µ         1.90
    ¨         1.82
    ľĵ        1.78
    ®         1.73
     amounts  1.72
    Ī         1.72
    Ł         1.70
    č↵        1.68
              1.68
              1.68
    Activation Density: 0.481%

    No Known Activations