INDEX
    Explanations

    phrases related to causality or reason

    phrases that introduce explanations or justifications

    Negative Logits
    Contact                  -0.88
    LAB                      -0.83
    marine                   -0.77
    zona                     -0.75
    contact                  -0.71
    Movie                    -0.70
    Minimum                  -0.68
    BuyableInstoreAndOnline  -0.67
    Dro                      -0.66
    Jr                       -0.64
    Positive Logits
     instance   1.24
    gotten      1.22
    bidden      1.19
     example    1.15
     centuries  1.09
     millennia  0.97
    cing        0.95
     reasons    0.91
     decades    0.91
    cible       0.90
    Activation Density: 0.096%

    No Known Activations