INDEX
    Explanations

    references to accountability and consequences for actions

    Negative Logits
     purpoſe          -0.78
     pleaſure         -0.75
     fieldNum         -0.73
     myſelf           -0.73
     Monfieur         -0.72
     Diſ              -0.72
     houſe            -0.71
     Theſe            -0.70
     ſtate            -0.69
    ContentAsync      -0.68
    Positive Logits
    Vidite            0.77
     незавершена      0.73
    KURZBESCHREIBUNG  0.60
    تقاوى             0.59
     lack             0.54
     vì               0.52
    Geplaatst         0.49
    StreetMap         0.49
                      0.48
     vermis           0.46
    Act Density 0.378%

    No Known Activations