INDEX
    Explanations

    words indicating negative sentiment or claims

    Negative Logits
    Strategies        0.98
    Folders           0.90
     Når              0.90
    strategies        0.89
    Different         0.88
    .​​                 0.87
    Matrices          0.86
                      0.86
     multiplicación   0.85
    algèbre           0.84
    Positive Logits
     huge             1.36
     pissed           1.35
     HUGE             1.34
     suspiciously     1.32
     dubious          1.30
     THEIR            1.25
     basically        1.24
     shitty           1.24
     worthless        1.23
     dodgy            1.23
    Activation Density: 0.035%

    No Known Activations