INDEX
    Explanations

    references to feelings of compulsion or inexplicable motivations

    Negative Logits
    Token                 Logit
    ' Infórmanos'         -0.48
    'portál'              -0.46
    ' Numerade'           -0.45
    'sumowanie'           -0.45
    'Personensuche'       -0.44
    ' gynhyrchwyd'        -0.44
    '点此举报'            -0.44
    ' @"/'                -0.44
    ' ویکی‌آمباردا'        -0.44
                          -0.44
    Positive Logits
    Token                 Logit
    'Somehow'             0.65
    ' Somehow'            0.64
    ' somehow'            0.56
    ' weirdly'            0.53
    'weird'               0.53
    ' strangely'          0.52
    ' weird'              0.49
    ' mysteriously'       0.49
    ' oddly'              0.47
    ' Weird'              0.45
    Activation Density: 0.014%

    No Known Activations