INDEX
    Explanations
    NEGATIVE LOGITS
    Token            Logit
    " nostalgic"     -0.08
    "อต"             -0.07
    " thiên"         -0.06
    "*$"             -0.06
    ".init"          -0.06
    "ţ"              -0.06
    "POSIT"          -0.06
    "_Native"        -0.06
    "121"            -0.06
    " protective"    -0.06
    POSITIVE LOGITS
    Token            Logit
    " behalf"        0.11
                     0.07
    " नय"            0.06
    ":image"         0.06
    " Ahead"         0.06
    "<b"             0.06
    " chauff"        0.06
    "фров"           0.06
    " agents"        0.06
    " materially"    0.06
    Activation Density: 0.004%

    No Known Activations