INDEX
    Explanations

    phrases related to communication and the sharing of information

    Negative Logits
    Token        Logit
    "jan"        -0.15
    "ialis"      -0.15
    " terra"     -0.15
    "dex"        -0.14
    "misc"       -0.14
    "bsub"       -0.14
    " excer"     -0.13
    " elev"      -0.13
    " sc"        -0.13
    "LAS"        -0.13
    Positive Logits
    Token              Logit
    " explaining"      0.25
    " Explain"         0.23
    " explain"         0.23
    " explanation"     0.19
    " explained"       0.18
    "explain"          0.18
    " Explanation"     0.17
    " explains"        0.17
    " explanations"    0.17
    "explained"        0.17
    Activation Density: 0.231%

    No Known Activations