    Negative Logits
    syn        1.09
    aker       1.05
     syn       1.04
     gene      1.03
     Syn       1.01
    ogene      1.00
     باك       0.96
     gluten    0.93
     Nantes    0.93
     player    0.93
    Positive Logits
     $         1.85
    $          1.70
     `$        1.51
     "$        1.34
     '$        1.31
    }$         1.29
    $"         1.28
     _$        1.27
    >$         1.27
     ($        1.25
    Activation Density: 1.422%

    No Known Activations