INDEX
    Explanations
    Negative Logits
    70              -0.07
    ング            -0.07
     кредит         -0.07
     horses         -0.07
     diversos       -0.06
    ch              -0.06
     Defence        -0.06
    967             -0.06
    20              -0.06
    buch            -0.06
    Positive Logits
    (@"             0.10
    :@"             0.09
     @"             0.07
    ",@"            0.07
    =@"             0.07
    AA              0.07
    .Ret            0.07
     Compilation    0.07
     @"";↵          0.07
     Ezek           0.07
    Activation Density: 0.002%

    No Known Activations