INDEX
    Explanations

    positive adjectives

    positive descriptors and evaluations of quality

    NEGATIVE LOGITS
    meric             -0.69
    HQ                -0.66
    mere              -0.65
    pora              -0.65
    berto             -0.61
    minecraft         -0.61
    atars             -0.60
    uala              -0.60
     Peb              -0.58
    ubi               -0.58

    POSITIVE LOGITS
     indeed           0.78
     nails            0.76
     sounding         0.74
    nered             0.68
     understatement   0.68
     wound            0.67
     nowadays         0.61
     spoiler          0.61
     (>               0.61
     sailing          0.61

    Activation Density 0.115%

    No Known Activations