INDEX
    Explanations

    names of people or places with specific formatting and symbols

    the character or symbol "Ŀ" in the text

    NEGATIVE LOGITS
    Token              Logit
    " referen"         -0.80
    "awaru"            -0.78
    " exhib"           -0.68
    " unim"            -0.68
    " memos"           -0.67
    " wra"             -0.67
    "chnology"         -0.67
    " unborn"          -0.66
    " ultras"          -0.66
    " autobiography"   -0.65
    POSITIVE LOGITS
    Token              Logit
    "ļ"                0.89
    "Ŀ"                0.87
    "º"                0.86
    "bryce"            0.85
    "ÏĦ"               0.83
    "SourceFile"       0.83
    "¼"                0.81
    "ï¸ı"              0.80
    "Ĺ"                0.80
    "taboola"          0.79
    Activation Density: 0.138%

    No Known Activations