INDEX
    Explanations

    understanding certain languages

    NEGATIVE LOGITS
     almost
    -0.13
     Almost
    -0.11
    Almost
    -0.11
     both
    -0.11
     Various
    -0.10
     mostly
    -0.10
     rather
    -0.09
     både
    -0.09
    igar
    -0.09
     casi
    -0.09
    POSITIVE LOGITS
     certain
    0.40
     certains
    0.29
     Certain
    0.29
    Certain
    0.27
    某
    0.24
     bestimm
    0.22
    ertain
    0.21
     some
    0.19
    有些
    0.18
     older
    0.17
    Act Density 0.100%

    No Known Activations