INDEX
    Explanations

    references to workshop quality and organization

    Negative Logits
    "etter"          -0.16
    "ofilm"          -0.15
    "asaki"          -0.15
    "@dynamic"       -0.15
    "etten"          -0.15
    "923"            -0.14
    "eren"           -0.14
    "dera"           -0.14
    " practition"    -0.14
    "hots"           -0.14
    Positive Logits
    " Indian"        0.22
    " Bombay"        0.20
    "BT"             0.20
    " BT"            0.19
    "EE"             0.19
    "Indian"         0.18
    " Mad"           0.18
    "awah"           0.18
    " Pow"           0.17
    " Kan"           0.17
    Activation Density: 0.020%

    No Known Activations