INDEX
    Explanations

    phrases that indicate examples, lists, or specific themes

    NEGATIVE LOGITS
    imos
    -0.15
    ="__
    -0.14
    istrovství
    -0.14
    anca
    -0.13
    alles
    -0.13
    yne
    -0.13
    енту
    -0.13
     looph
    -0.13
    unos
    -0.13
    lag
    -0.13
    POSITIVE LOGITS
     ways
    0.24
     among
    0.22
     many
    0.20
    among
    0.19
    poss
    0.18
     amongst
    0.17
     Among
    0.17
    many
    0.17
     Ways
    0.17
    -many
    0.17
    Act Density 0.058%

    No Known Activations