    Negative Logits
    Token             Logit
    "_KIND"           -0.08
    " Egypt"          -0.07
    " elites"         -0.07
    (not recovered)   -0.07
    "ALI"             -0.07
    " Choosing"       -0.06
    " กร"             -0.06
    " ориг"           -0.06
    " STEP"           -0.06
    " Curry"          -0.06
    Positive Logits
    Token             Logit
    " Intr"           0.13
    " intr"           0.12
    " intra"          0.10
    "intr"            0.09
    "_intr"           0.09
    " laut"           0.07
    "_INTR"           0.07
    " intricate"      0.07
    "_HAVE"           0.07
    " intric"         0.07
    Activation Density: 0.006%

    No Known Activations