INDEX
    Explanations
    Negative Logits
     Teste      -0.80
     nazis      -0.79
                -0.78
     TROP       -0.77
     Forst      -0.77
     stocker    -0.74
     snip       -0.74
     Glossary   -0.73
     crimp      -0.73
    outdir      -0.73
    Positive Logits
     描          0.85
    agal        0.84
    ιώ          0.83
     both       0.81
    kennen      0.80
    tivation    0.77
     them       0.76
    richlet     0.74
    NSLog       0.74
     they       0.74
    Act Density 0.016%

    No Known Activations