INDEX
    Explanations

    expressions of personal opinion and social commentary

    Negative Logits
    "ivar"      -0.15
    "utow"      -0.15
    "ладÑĥ"     -0.14
    "reu"       -0.14
    "reon"      -0.13
    "hus"       -0.13
    "iesel"     -0.13
    "lea"       -0.13
    "vsp"       -0.13
    "dea"       -0.13
    Positive Logits
    " fine"     0.35
    "fine"      0.32
    " Fine"     0.27
    "Fine"      0.27
    " tough"    0.25
    " screw"    0.24
    "FINE"      0.23
    " deal"     0.23
    " Tough"    0.21
    " go"       0.20
    Act Density 0.236%

    No Known Activations