INDEX
    NEGATIVE LOGITS
    ن          0.76
    ר          0.72
    ر          0.70
    ك          0.68
               0.68
    <0x80>     0.67
    ת          0.64
    عند        0.64
    <0x81>     0.62
    но         0.61
    POSITIVE LOGITS
    Mill       0.62
    mill       0.59
    Half       0.57
    Theta      0.57
    Benchmark  0.57
    Pir        0.56
    SIL        0.55
    BW         0.55
    chedules   0.55
    Sac        0.54
    Activation Density: 0.010%

    No Known Activations