INDEX
    Explanations

    comparisons between different subjects, especially in terms of similarities and differences

    Negative Logits
    "izon"   -0.16
    "izont"  -0.15
    "gain"   -0.14
    "chet"   -0.14
    "inch"   -0.14
    "609"    -0.14
    "ngle"   -0.14
    "vet"    -0.14
    "ebin"   -0.13
    "indre"  -0.13
    Positive Logits
    " alike"      0.39
    "相同"        0.33
    " both"       0.32
    "Both"        0.30
    "both"        0.29
    " Both"       0.29
    "一样"        0.28
    " identical"  0.28
    " BOTH"       0.27
    " similar"    0.26
    Act Density 0.170%

    No Known Activations