    Explanations

    electric, electrical, electricity

    Negative Logits
    Token                   Logit
    "ين"                    1.15
    "ای"                    1.05
    (token text missing)    1.05
    (token text missing)    0.98
    "жи"                    0.96
    "au"                    0.95
    " auxqu"                0.95
    "การ"                   0.94
    "م"                     0.94
    "ใน"                    0.94
    Positive Logits
    Token             Logit
    "↵↵"              1.00
    "0"               0.98
    " on"             0.93
    " electric"       0.85
    " it"             0.84
    " electro"        0.84
    "sen"             0.83
    " electricity"    0.82
    "this"            0.82
    " not"            0.81
    Activation Density: 0.044%

    No Known Activations