INDEX
    Explanations

    phrases that indicate individual or collective participation in processes or systems

    Negative Logits
    Token       Logit
    "sr"        -0.57
    "<eos>"     -0.56
    " ."        -0.56
    "ilim"      -0.55
    " “"        -0.54
    " /"        -0.54
    "↵↵"        -0.52
    " li"       -0.52
    "  "        -0.52
    "ö"         -0.52
    Positive Logits
    Token       Logit
    " itſelf"   1.49
    " each"     1.46
    " EACH"     1.41
    " Chaque"   1.39
    "each"      1.38
    "Chaque"    1.36
    " Each"     1.36
    "Each"      1.36
    "EACH"      1.35
    " Ogni"     1.35
    Act Density 0.354%

    No Known Activations