INDEX
    Explanations

    phrases related to beliefs, descriptions, and assumptions about certain topics or entities

    phrases that express common beliefs or perceptions

    Negative Logits
    Token            Logit
    "aleb"           -0.62
    " Donkey"        -0.60
    " disg"          -0.60
    " Hungry"        -0.60
    " Bravo"         -0.58
    " Kl"            -0.58
    "alos"           -0.57
    " VG"            -0.57
    " Competition"   -0.56
    " Sierra"        -0.56
    Positive Logits
    Token                   Logit
    "isSpecialOrderable"    0.78
    "ت"                     0.75
    " pegged"               0.71
    "س"                     0.70
    "hack"                  0.68
    "inet"                  0.68
    "ypes"                  0.67
    "idable"                0.66
    "à¨"                    0.66
    " derog"                0.66
    Activation Density: 0.159%

    No Known Activations