INDEX
    Explanations

    same concept across contexts

    NEGATIVE LOGITS
    "ciendo"  0.45
    "attup"  0.44
    "COD"  0.42
    "Experts"  0.42
    "جهات"  0.42
    " праців"  0.42
    "শ্য"  0.41
    "的具体"  0.41
    "ందని"  0.40
    " professionals"  0.40
    POSITIVE LOGITS
    " same"  0.56
    " Same"  0.52
    "same"  0.51
    "Same"  0.48
    " misma"  0.45
    " selben"  0.44
    " mismo"  0.44
    " 같은"  0.44
    " Ibid"  0.43
    " SAME"  0.42
    Act Density 0.000%

    No Known Activations