    Explanations

    braces and opening curly brackets

    NEGATIVE LOGITS
    achim
    -0.65
    ΟΣ
    -0.61
    wego
    -0.60
    ÁG
    -0.59
    iels
    -0.59
    isburg
    -0.59
    jeev
    -0.58
    -0.58
    ława
    -0.58
    tellt
    -0.57
    POSITIVE LOGITS
    __':
    
    1.49
    __':
    1.47
    __":
    1.47
    __":
    
    1.43
    ))){
    1.14
    --){
    1.12
     الحره
    1.10
    "])){
    1.09
    '])){
    1.05
    ])));
    1.01
    Activation Density: 0.055%

    No Known Activations