    Explanations

    This neuron fires on terms describing long-range interactions or correlations in the text.

    Negative Logits
    " Heather"            -0.07
    " Query"              -0.06
    "CustomAttributes"    -0.06
    " mz"                 -0.06
    "mlx"                 -0.06
    " dw"                 -0.06
    " شهید"               -0.06
    " истор"              -0.06
    (token not shown)     -0.06
    " bureauc"            -0.06
    Positive Logits
    "_LONG"               0.07
    " 하지"               0.07
    " reach"              0.07
    "ابة"                 0.07
    " concentrates"       0.07
    " transported"        0.06
    " charakter"          0.06
    " действия"           0.06
    " الإن"               0.06
    " Lesson"             0.06
    Activation Density: 0.003%

    No Known Activations
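
For readers unfamiliar with the activation density figure above, the following is a minimal sketch of how such a percentage could be computed. It assumes density means the share of tokens on which the neuron's activation exceeds a threshold of zero; the page does not state its exact definition, and the function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the fraction of firing tokens, expressed
    as a percentage. The dashboard's own definition may differ.
    """
    if not activations:
        return 0.0
    firing = sum(1 for a in activations if a > threshold)
    return 100.0 * firing / len(activations)


# Hypothetical sample: the neuron fires on 3 tokens out of 100,000.
sample = [0.0] * 99_997 + [0.4, 1.2, 0.9]
print(f"{activation_density(sample):.3f}%")  # prints "0.003%"
```

Under this assumed definition, a density of 0.003% corresponds to roughly 3 firing tokens per 100,000, which fits a page that lists no known activations.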