    Explanations

    academic citations and bibliographies

    Negative Logits
    Token         Logit
    " all"        -1.13
    "どれも"      -1.06
    "どれ"        -1.06
    "全て"        -1.04
    "みんなで"    -1.01
    "すべて"      -0.98
    " semuanya"   -0.98
    "三种"        -0.95
    " tất"        -0.94
    " כול"        -0.90
    Positive Logits
    Token         Logit
    " both"       3.11
    "Both"        2.58
    " BOTH"       2.52
    "both"        2.41
    " Both"       2.33
    " beide"      2.33
    " beiden"     2.31
    " entrambi"   2.27
    " ambos"      2.23
    "BOTH"        2.23
    Activation Density: 0.009%

    No Known Activations