    Explanations

    phrases expressing requests or suggestions

    Negative Logits
    " geleverd"       -0.55
    " høj"            -0.55
    " voeten"         -0.54
    " jedin"          -0.52
    " attestation"    -0.51
    " emerges"        -0.50
    " således"        -0.49
    "InSection"       -0.49
    "classnames"      -0.48
    " Jefus"          -0.48
    Positive Logits
    " try"            0.99
    " take"           0.87
    " check"          0.85
    " put"            0.82
    " hit"            0.80
    " use"            0.78
    " let"            0.77
    " spend"          0.77
    "devamını"        0.76
    " think"          0.75
    Activation Density: 0.189%

    No Known Activations