    Explanations

    common words followed by specific continuations

    Negative Logits
     whitish  0.19
    いずれ  0.19
     фрук  0.18
     drowsiness  0.18
     tvar  0.18
     substring  0.18
     adjectives  0.17
     physiologique  0.17
     deceiving  0.17
    ggbb  0.17
    Positive Logits
    4  0.27
    G  0.26
     and  0.26
    5  0.26
    H  0.25
    P  0.25
    7  0.25
    1  0.24
    6  0.23
    Г  0.23
    Activation Density: 4.193%

    No Known Activations