    Explanations

    Instances of the word "same" and its variants, used to indicate similarity or comparison.

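    Negative Logits
    (tokens are quoted so that leading spaces stay visible)

    Token               Logit
    " DiCaprio"         -0.68
    " commerciales"     -0.66
    "ardı"              -0.63
    "hört"              -0.62
    "Jegyzetek"         -0.61
    "tipped"            -0.61
    "pios"              -0.60
    " zufolge"          -0.59
    "{}/"               -0.59
    "setupUi"           -0.59

    Positive Logits
    (tokens are quoted so that leading spaces stay visible)

    Token               Logit
    "SAME"               1.87
    " same"              1.85
    "Same"               1.80
    " SAME"              1.74
    "same"               1.72
    " Same"              1.70
    " samme"             1.45
    " samma"             1.32
    "isSame"             1.23
    " demselben"         1.23
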
    Activation Density: 0.115%

    No Known Activations