INDEX
    Explanations
    NEGATIVE LOGITS
    Token          Logit
    "indeki"       -0.07
    " عبار"        -0.07
    " simplest"    -0.06
    " Meyer"       -0.06
    (blank)        -0.06
    "GPL"          -0.06
    (blank)        -0.06
    " آینده"       -0.06
    "=index"       -0.06
    ".member"      -0.06
    POSITIVE LOGITS
    Token          Logit
    "Guard"        0.15
    "guard"        0.14
    "_guard"       0.12
    " guard"       0.11
    " GU"          0.10
    "guards"       0.10
    " Guardian"    0.10
    " Guard"       0.10
    ".guard"       0.10
    " Guardians"   0.10
    Activation Density: 0.005%

    No Known Activations