INDEX
    Explanations

    Negative Logits
     obič           0.86
    нин             0.82
     Dolomites      0.82
    orges           0.80
    shortcut        0.78
     jongens        0.77
    ScriptName      0.77
    Stretch         0.76
    <unused150>     0.75
    [blank token]   0.74
    Positive Logits
    ,               1.18
     being          1.11
    ،               1.05
     ,              1.03
     becoming       0.99
    [blank token]   0.90
    ,,              0.88
    [blank token]   0.87
     quality        0.86
     itself         0.86
    Act Density 0.001%

    No Known Activations