    Explanations

    direct speech, quotations, and direct questions

    dialogue and statements made by characters

    Negative Logits
    £ı            -0.85
    malink        -0.69
    agos          -0.65
    atures        -0.62
    å§«           -0.61
     wildfires    -0.61
    "]=>          -0.61
    unity         -0.61
    utton         -0.60
    ãĥĩ           -0.59

    Positive Logits
     hello        0.91
     aloud        0.77
     politely     0.76
     loudly       0.74
    dayName       0.71
    yip           0.71
     goodbye      0.71
     inviting     0.68
     dissatisf    0.67
     angrily      0.66

    Act Density 0.352%

    No Known Activations