    Explanations

    phrases related to realization and acknowledgment of truths or facts

    Negative Logits
    Token             Logit
    " ویکی‌پدی"        -0.54
    " Италијани"      -0.53
    " них"            -0.48
    "protoimpl"       -0.47
    "postsleuth"      -0.45
    "SBATCH"          -0.45
    " الرياضيه"       -0.44
    "ToTensor"        -0.43
    "OCCURRED"        -0.43
    "AddTagHelper"    -0.42
    Positive Logits
    Token             Logit
    " they"           1.33
    " there"          1.28
    " we"             1.25
    " she"            1.06
    " it"             0.96
    " he"             0.89
    " theres"         0.86
    "there"           0.80
    " you"            0.76
    " theyre"         0.75
    Activation Density: 0.620%

    No Known Activations