INDEX
    Explanations

    instances of user interactions and dialogues within forum threads

    Negative Logits
    pta
    -0.16
    anks
    -0.15
    otel
    -0.15
    okus
    -0.15
     Mansion
    -0.15
    intl
    -0.14
    issen
    -0.14
    ese
    -0.14
     Luo
    -0.14
    ibal
    -0.14
    Positive Logits
     wrote
    0.27
     »
    0.19
    帖
    0.18
    »
    0.16
     Wed
    0.16
    [color
    0.15
     Re
    0.15
    ujet
    0.15
    Post
    0.14
     Post
    0.14
    Act Density 0.006%

    No Known Activations