    Explanations

    references to relationships and their complexities

    Negative Logits
    " BOTH"         -0.25
    " both"         -0.24
    "both"          -0.24
    " Both"         -0.21
    "Both"          -0.21
    " beide"        -0.16
    "_BOTH"         -0.16
    " ambos"        -0.16
    " både"         -0.15
    " ALWAYS"       -0.14

    Positive Logits
    " even"          0.17
    "even"           0.15
    " thing"         0.15
    " something"     0.15
    " soon"          0.14
    " incluso"       0.14
    "689"            0.14
    " sogar"         0.14
    "ancel"          0.14
    "something"      0.14

    Act Density 0.078%

    No Known Activations