INDEX
    Explanations

    The neuron detects occurrences of the exact phrase “same thing.”

    Negative Logits
    ama
    -0.06
    ์)
    -0.06
    Hola
    -0.06
     một
    -0.06
    _it
    -0.06
     &___
    -0.06
     apps
    -0.06
    -0.06
     gatherings
    -0.06
     těla
    -0.06
    Positive Logits
     최저
    0.07
    Concrete
    0.07
    .vertices
    0.07
     Riding
    0.07
    .only
    0.06
    と思
    0.06
    Coefficient
    0.06
    :g
    0.06
     holders
    0.06
    .calculate
    0.06
    Activation Density: 0.011%

    No Known Activations