INDEX
    Explanations
    NEGATIVE LOGITS
    Token            Logit
    "in"             0.80
    "tiene"          0.77
    "table"          0.73
    "daten"          0.73
    "í"              0.73
    "den"            0.69
    "hall"           0.68
    "als"            0.66
    " автомобиль"    0.66
    " It"            0.65
    POSITIVE LOGITS
    Token            Logit
    "ב"              0.98
    " conscience"    0.97
    "ה"              0.92
    "consc"          0.87
    "ي"              0.83
    "G"              0.82
    "<0x80>"         0.82
    "D"              0.82
    "TS"             0.80
    " consci"        0.77
    Activation Density: 0.001%

    No Known Activations