    Explanations

    introduces clauses or explanations

    Negative Logits
    " solved"       0.47
    "0"             0.45
    " movies"       0.44
    " pancakes"     0.43
    " lasagna"      0.43
    "ю"             0.42
    "ations"        0.42
    " varied"       0.41
    " infinito"     0.41
    "हारिक"          0.41

    Positive Logits
    " Whereas"      0.60
    " Firstly"      0.60
    " Within"       0.57
    " Referring"    0.56
    " Whenever"     0.55
    " Namely"       0.55
    " While"        0.54
    " When"         0.54
    " Upon"         0.53
    " In"           0.52

    Activation Density: 0.005%

    No Known Activations