INDEX
    Explanations

    statements starting with "In fact" and "actually"

    statements that emphasize factual information

    Negative Logits
    Token           Logit
    "GBT"           -0.56
    "ggles"         -0.56
    "eded"          -0.55
    "Lastly"        -0.54
    "rounder"       -0.53
    "arthed"        -0.53
    " Lastly"       -0.53
    " Flavoring"    -0.53
    "prus"          -0.52
    "peat"          -0.52
    Positive Logits
    Token           Logit
    ","             0.98
    ",."            0.78
    "terday"        0.77
    ",..."          0.73
    ".,"            0.70
    "!,"            0.70
    "rophe"         0.65
    "oln"           0.64
    ",,"            0.64
    ",-"            0.62
    Act Density 0.056%

    No Known Activations