INDEX
    Explanations

    names of individuals or characters

    Negative Logits
    Geplaatst        -0.69
    astéro           -0.58
    Искәрмәләр       -0.56
     Preferencias    -0.55
    Distribuzione    -0.53
     стаття          -0.52
    insee            -0.50
     invokingState   -0.50
    Etimología       -0.49
     António         -0.49
    Positive Logits
    extAlignment     0.48
    Semitism         0.46
     honor           0.45
    ParallelGroup    0.45
                     0.42
    ukunft           0.42
    さん             0.41
    Honor            0.41
     Honor           0.41
     Britann         0.40
    Activation Density: 0.147%

    No Known Activations