INDEX
    Explanations

    comparisons between different situations or scenarios

    comparative phrases emphasizing the notion of similarity

    Negative Logits
    hiba        -0.89
    ulty        -0.86
    yrinth      -0.81
    iland       -0.78
    bard        -0.78
    iband       -0.78
    acia        -0.75
    iple        -0.73
    icity       -0.71
    ishop       -0.70
    Positive Logits
    lihood      1.43
    lier        1.16
    liest       1.10
    liness      0.88
     Nor        0.78
     nor        0.74
     ours       0.72
    erous       0.71
    able        0.70
     anything   0.70
    Act Density 0.040%

    No Known Activations