    Negative Logits
     forgiving    0.43
    prop          0.42
    ốc            0.41
    Prop          0.40
    c             0.39
     Fredrik      0.39
                  0.38
    fk            0.38
    lossen        0.38
     s            0.38

    Positive Logits
    Telephone     0.61
     alguien      0.61
     नए           0.60
     Teilnehmer   0.60
     telefono     0.57
                  0.57
     nuevo        0.56
                  0.56
    Telefono      0.56
                  0.55

    Activation Density: 0.005%

    No Known Activations