    Negative Logits
    Token            Logit
    " Ansi"          -0.16
    " Franti"        -0.15
    " potentials"    -0.15
    "ivalent"        -0.15
    "agger"          -0.14
    "lander"         -0.14
    "168"            -0.14
    "hydrate"        -0.14
    "ocrat"          -0.14
    " Slots"         -0.14
    Positive Logits
    Token            Logit
    "泡"             0.17
    "ılıç"           0.15
    "aland"          0.15
    "ewan"           0.14
    "phins"          0.14
    " Hobby"         0.14
    "мини"           0.14
    "redients"       0.14
    "venir"          0.14
    "erture"         0.13
    Activation Density: 0.002%

    No Known Activations