INDEX
    Explanations

    disclaimers and guidelines

    Negative Logits
     y                0.55
     empire           0.51
     lo               0.51
     n                0.50
     indigo           0.49
     companionship    0.49
     o                0.48
     bring            0.48
     beech            0.48
     z                0.48
    Positive Logits
    К             0.59
    л             0.55
    т             0.52
    Guidelines    0.51
    Kết           0.50
    Deux          0.49
    compliant     0.48
    برای          0.48
    те            0.48
    विरो          0.48
    Activation Density: 0.157%

    No Known Activations