INDEX
    Explanations

    compound words and phrases

    Negative Logits
    " the"     1.27
    " a"       1.26
    " Jeg"     1.22
    " og"      1.21
    " and"     1.18
    " or"      1.14
    " ("       1.14
    " all"     1.08
    "."        1.07
    " just"    1.07
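    Positive Logits
    (token not shown)    1.53
    (token not shown)    1.46
    "此外"               1.43
    "色的"               1.41
    "力和"               1.41
    (token not shown)    1.40
    (token not shown)    1.39
    "线"                 1.37
    (token not shown)    1.36
    (token not shown)    1.33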
    Act Density 0.038%

    No Known Activations