INDEX
    Explanations

    critical line and controls

    Negative Logits
     apathy         0.47
    poetrylovers    0.47
     synthesize     0.47
     warmest        0.46
     IN             0.46
     personnelle    0.45
    Synt            0.45
    art             0.45
    פת              0.45
     hottest        0.44
    Positive Logits
    <0xF3>          0.55
    ীম              0.55
                    0.54
                    0.54
    í               0.52
                    0.51
                    0.51
    usz             0.51
                    0.50
                    0.50
    Act Density 0.001%

    No Known Activations