INDEX
    Explanations

    magical or mythical elements in discussions of concepts or items

    Negative Logits
    Token           Logit
    " itself"       -0.76
    " is"           -0.70
    " was"          -0.67
    " kuris"        -0.66
    " которому"     -0.66
    "itself"        -0.66
    " its"          -0.64
    " Its"          -0.61
    "tagHelper"     -0.59
    " himself"      -0.58
    Positive Logits
    Token           Logit
    " themselves"   1.52
    "themselves"    1.30
    " cherchés"     1.15
    " are"          1.03
    " were"         0.90
    " jotka"        0.87
    " eds"          0.83
    " themſelves"   0.83
    " אלה"          0.80
    " those"        0.79
    Activation Density: 3.916%

    No Known Activations