    Explanations

    conjunctions and transitional phrases

    Comes before alternatives or options

    alternative recommendations and warnings

    Negative Logits
    )');
    -0.65
    ?";
    -0.62
    -0.61
    ')";
    -0.58
    ]));
    
    -0.57
    "");
    -0.57
    ]<<"
    -0.57
    eaways
    -0.56
    ?");
    -0.56
    /');
    -0.55
    Positive Logits
     beware
    1.09
     you
    1.03
     Beware
    0.95
    おすすめです
    0.95
    Beware
    0.93
    you
    0.90
     You
    0.89
    Alternatively
    0.87
     Alternatively
    0.87
    オススメです
    0.86
    Activation Density 0.217%

    No Known Activations