INDEX
    Explanations

    sections related to scientific studies and findings

    Negative Logits
    Token             Logit
    " Ceux"           -0.64
    " anything"       -0.61
    "Anything"        -0.60
    "每一个"           -0.60
    " every"          -0.59
    "OGND"            -0.59
    " everything"     -0.58
    "Diwedd"          -0.58
    " оригіналу"      -0.58
    " meeste"         -0.57

    Positive Logits
    Token             Logit
    " similarly"       1.03
    " unrelated"       0.94
    " than"            0.89
    " equally"         0.88
    " nearby"          0.85
    " besides"         0.84
    " igualmente"      0.84
    " similar"         0.83
    " serupa"          0.81
    "nearby"           0.78
    Act Density 0.459%

    No Known Activations