INDEX
    Explanations

    relationships and interactions among characters in social situations

    Negative Logits
    Token       Logit
    "rea"       -0.15
    "ãģĴ"       -0.15
    "cmp"       -0.14
    "NSS"       -0.14
    "uj"        -0.14
    "aut"       -0.14
    "uci"       -0.14
    "еÑĢж"      -0.14
    " Reyes"    -0.13
    "ila"       -0.13
    Positive Logits
    Token         Logit
    " instead"    0.23
    "instead"     0.21
    " rather"     0.20
    " alone"      0.19
    "alone"       0.19
    "rather"      0.18
    " Instead"    0.18
    " Ital"       0.18
    " Alone"      0.17
    " Rather"     0.16
    Activation Density: 0.232%

    No Known Activations