INDEX
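    NEGATIVE LOGITS
    "Settings"      0.55
    " Viking"       0.54
    " Gild"         0.52
    " Dinas"        0.51
    " Kraków"       0.51
    " Gda"          0.50
    " Settings"     0.50
    (token missing) 0.48
    " Saras"        0.47
    " Mods"         0.47
    POSITIVE LOGITS
    "plaat"         0.50
    (token missing) 0.47
    (token missing) 0.46
    "itrile"        0.45
    " tử"           0.45
    (token missing) 0.45
    "ون"            0.45
    (token missing) 0.44
    "теле"          0.44
    " playfully"    0.44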
    Act Density 0.000%

    No Known Activations