INDEX
    Explanations

    punctuation and specific formatting related to quotations or cited text

    Negative Logits
    "ipo"               -0.14
    "eldom"             -0.13
    "osaur"             -0.13
    "LEV"               -0.13
    " множе"            -0.12
    "urette"            -0.12
    "ÂĢÂĻ"              -0.12
    "olics"             -0.12
    "å¸Į"               -0.12
    ".MixedReality"     -0.12
    Positive Logits
    "Âł"                0.49
    "	"                 0.25
    "Âłmiles"           0.21
    " _"                0.21
    "Âł↵"               0.19
    "ÂłkW"              0.19
    "Âłmi"              0.18
    "Âłmph"             0.17
    "³³"                0.17
    "Âłb"               0.17
    Activation Density: 0.018%

    No Known Activations