INDEX
    Explanations

    proper nouns or specific names of entities

    phrases that indicate something is widely recognized or identified

    Negative Logits
    cel         -0.84
    otom        -0.80
    prus        -0.78
    otos        -0.76
    oton        -0.76
    plet        -0.74
    ŃĶ          -0.73
    adies       -0.73
    lot         -0.73
    ingers      -0.72
    Positive Logits
     known      0.97
     Known      0.96
    NESS        0.87
    KNOWN       0.84
    ledged      0.78
    comings     0.76
    Known       0.75
    =]          0.75
    л           0.74
    known       0.72
    Activation Density: 0.025%

    No Known Activations