INDEX
    Explanations

    repetitions of the word "same"

    Negative Logits
        " propOrder"     -0.70
        " DiCaprio"      -0.68
        " האם"           -0.66
        "hört"           -0.65
        "setupUi"        -0.59
        "{}/"            -0.59
        "TRIBUN"         -0.58
        "льше"           -0.58
        "pios"           -0.57
        "Jegyzetek"      -0.56
    Positive Logits
        " same"           2.31
        "Same"            2.29
        "SAME"            2.26
        "same"            2.23
        " Same"           2.15
        " SAME"           2.12
        " samme"          1.69
        " samma"          1.67
        " mesma"          1.55
        " hetzelfde"      1.50
    Activation Density: 0.127%

    No Known Activations