    Explanations

    articles indicating a positive or appreciative stance

    Negative Logits
     Tal       -0.17
     tal       -0.17
    wan        -0.15
    ncia       -0.15
    alam       -0.15
    tal        -0.14
     Wonder    -0.14
    oot        -0.14
     tale      -0.14
    itol       -0.14
    Positive Logits
    pike       0.16
    -addons    0.16
    ÛĮزÛĮ      0.16
    ëļ         0.15
    _TP        0.15
    áÄį        0.14
    erence     0.14
    _TUN       0.14
    ampo       0.14
    "value     0.14
    Activation Density: 0.022%

    No Known Activations