INDEX
    Explanations

    forward slash and "var"

    Negative Logits
     oui
    -0.08
     dışı
    -0.07
     полот
    -0.07
     Dirk
    -0.06
    Career
    -0.06
     ві
    -0.06
    _AND
    -0.06
     Gospel
    -0.06
    ΥΣ
    -0.06
    ��
    -0.06
    Positive Logits
     recreate
    0.08
     Oktober
    0.06
    allet
    0.06
     backpack
    0.06
    Optimizer
    0.06
    ↵↵↵↵↵↵↵↵↵
    0.06
    ↵↵↵↵
    0.06
    ATES
    0.06
     Catalyst
    0.06
    0.06
    Act Density 0.001%

    No Known Activations