INDEX
    Explanations

    numerical values associated with specific categories or ratings

    Negative Logits
    dür          -0.16
    ÑģÑĮ         -0.16
    ource        -0.16
    vell         -0.15
    inal         -0.15
    onds         -0.15
    idebar       -0.15
    ano          -0.15
    oretical     -0.15
    drawing      -0.14
    Positive Logits
    年代         0.26
    s            0.25
    something    0.21
     odd         0.17
    ish          0.17
    -Ñħ          0.17
    th           0.16
    Something    0.16
    ahlen        0.16
    625          0.16
    Activation Density: 0.260%

    No Known Activations