INDEX
    Explanations

    phrases that indicate causation or justification, often related to actions taken by individuals or groups

    for your, for her, for its, for their

    Negative Logits
    __':           -0.57
    __":           -0.56
    __':           -0.52
    клопе          -0.49
    __":           -0.48
    Enllaços       -0.45
     caller        -0.44
     linkovi       -0.43
     myſelf        -0.42
    IVersion       -0.42

    Positive Logits
     خاطر          0.46
     kasarigan     0.41
    formin         0.40
     bedoeld       0.38
     ricord        0.38
    AutoScale      0.38
     geha          0.38
     trời          0.38
     ervoor        0.38
     schuldig      0.37

    Act Density 0.123%

    No Known Activations