    Explanations

    common articles and determiners in text

    Negative Logits
    "ÙħÙĪØ¯"    -0.15
    "olland"    -0.15
    "μη"        -0.15
    "isman"     -0.15
    "adder"     -0.14
    "addin"     -0.14
    " Hin"      -0.13
    "UDIO"      -0.13
    " tal"      -0.13
    "tered"     -0.13
    Positive Logits
    " particular"    0.42
    " given"         0.41
    "given"          0.34
    " PARTICULAR"    0.31
    " Given"         0.27
    "_given"         0.27
    "Given"          0.26
    " GIVEN"         0.26
    " particul"      0.25
    "icular"         0.22
    Act Density 0.215%

    No Known Activations