INDEX
    Explanations

    terms related to adjectives and descriptions in a document

    Negative Logits
    "inda"          -0.17
    "571"           -0.16
    " Slip"         -0.15
    " площ"         -0.15
    "itud"          -0.15
    "iktig"         -0.14
    "レン"          -0.14
    "キュ"          -0.14
    "thern"         -0.14
    ".idea"         -0.14
    Positive Logits
    " количество"   0.19
    " время"        0.19
    " количе"       0.18
    " значение"     0.18
    "ωμα"           0.18
    "atre"          0.17
    " название"     0.17
    "ство"          0.17
    " течение"      0.17
    " лицо"         0.17
    Act Density 0.031%

    No Known Activations