    Explanations

    Columbus or bus

    Negative Logits
    "015"          -0.07
    "etyl"         -0.07
    "te"           -0.06
    " Fran"        -0.06
    " الخاص"       -0.06
    "想要"          -0.06
    " excluding"   -0.06
    " japon"       -0.06
    " syn"         -0.06
    " naughty"     -0.06
    Positive Logits
    " Columbus"    0.17
    " Colum"       0.10
    " Colombian"   0.09
    " Colombia"    0.09
    " Colomb"      0.08
    " colum"       0.08
    ".Circle"      0.07
    " Oliveira"    0.07
    " میدان"       0.07
    " Cougar"      0.07
    Activation Density: 0.005%

    No Known Activations