INDEX
    Explanations

    phrases where the speaker expresses a comparison or likeness between two things

    phrases indicating comparisons or descriptions

    Negative Logits
    "heid"       -0.74
    "fulness"    -0.71
    "IAL"        -0.70
    "orest"      -0.70
    "jad"        -0.67
    "ajor"       -0.66
    " Yard"      -0.63
    "ario"       -0.62
    "ower"       -0.61
    "pak"        -0.61
    Positive Logits
    " forget"    0.80
    " luck"      0.79
    " kinda"     0.75
    " neat"      0.73
    " screwed"   0.72
    " shit"      0.70
    " messed"    0.69
    " scary"     0.68
    " tease"     0.67
    " fun"       0.67
    Activation Density: 0.040%

    No Known Activations