INDEX
    Explanations

    terms related to exploitation and coercive relationships

    Negative Logits
    Rüyada          -0.46
     zdan           -0.41
                    -0.39
     coinciden      -0.38
     вида           -0.38
     Niemand        -0.37
     soal           -0.37
    OrNil           -0.37
     mimo           -0.37
    sprechend       -0.36
    Positive Logits
     ſta            0.60
    ſelf            0.59
     pinulongan     0.58
    󠁬                0.58
     Anſ            0.56
     vrijwilli      0.55
     ſch            0.55
     hyö            0.54
     raiſ           0.53
    ſtra            0.52
    Act Density 0.953%

    No Known Activations