INDEX
    Explanations

    instances of people expressing opinions or making claims

    Negative Logits
    Token         Logit
    "sburg"       -0.22
    " yesterday"  -0.21
    "ges"         -0.21
    " earlier"    -0.19
    " бы"         -0.19
    "s"           -0.18
    "ries"        -0.17
    "ly"          -0.17
    "sb"          -0.17
    "elik"        -0.16
    Positive Logits
    Token         Logit
    "了一"        0.18
    " 了"         0.18
    "(ed"         0.17
    "asion"       0.16
    "oron"        0.16
    "able"        0.15
    "indr"        0.15
    "erved"       0.15
    "ouve"        0.15
    "_typeof"     0.14
    Act Density 0.053%

    No Known Activations