    Explanations

    expressions of emotional states and personal qualities

    Negative Logits
    Token         Logit
    " houſe"      -0.96
    " Efq"        -0.92
    " Theſe"      -0.88
    " Monfieur"   -0.87
    " fevere"     -0.86
    "ſelf"        -0.85
    " ſta"        -0.83
    " ſtate"      -0.81
    " laſt"       -0.81
    " ſche"       -0.80
    Positive Logits
    Token         Logit
    " without"    0.69
    "without"     0.56
    " WITHOUT"    0.56
    " vi"         0.53
    " образом"    0.52
    " Without"    0.50
    "Without"     0.48
    " tanpa"      0.48
    " ohne"       0.48
    ","           0.47
    Activation Density: 1.091%

    No Known Activations