    Explanations

    instances of the word "same."

    Negative Logits
    " DiCaprio"        -0.78
    "hört"             -0.71
    "Jegyzetek"        -0.70
    " commerciales"    -0.67
    " propOrder"       -0.67
    "RegressionTest"   -0.66
    "setupUi"          -0.66
    " Ellison"         -0.65
    "tipped"           -0.64
    "{}/"              -0.64
    Positive Logits
    " same"            2.17
    "Same"             2.15
    "SAME"             2.12
    "same"             2.08
    " Same"            2.02
    " SAME"            1.99
    " samme"           1.62
    " samma"           1.50
    " mesma"           1.46
    "isSame"           1.36
    Activation Density: 0.134%

    No Known Activations