INDEX
    Explanations

    aspects related to emotional responses to social dynamics

    Negative Logits
     autorytatywna
    -1.09
     myſelf
    -1.08
     Efq
    -1.00
     itſelf
    -0.99
    aarrggbb
    -0.94
    ſelf
    -0.92
    OGND
    -0.90
    ſelves
    -0.90
     houſe
    -0.89
     themſelves
    -0.88
    Positive Logits
    .
    0.56
    ,
    0.50
    ~
    0.47
    [
    0.45
     lost
    0.44
    zeug
    0.44
    ias
    0.42
    ...
    0.42
    0.42
     [
    0.41
    Activation Density 0.254%

    No Known Activations