INDEX
    Explanations

    instances of temporal phrases or events related to actions

    Negative Logits
     myſelf            -0.96
     itſelf            -0.91
    )");               -0.90
    RegistryLite       -0.81
    ſelves             -0.79
     Monfieur          -0.78
    //{                -0.77
     ―――――             -0.77
    tagHelperRunner    -0.74
     themſelves        -0.74
    Positive Logits
    Tembelea           0.52
     Ab                0.46
    <eos>              0.44
    icin               0.41
    paccio             0.40
    airo               0.39
                       0.39
     Word              0.38
     Drawer            0.38
    كتشاف              0.38
    Act Density 0.068%

    No Known Activations