    Explanations

    phrases that indicate introspection and self-discovery

    Follows "find" or "found"

    Negative Logits
     допомогти        -0.51
     departe          -0.51
    Pfalz             -0.49
     indisponible     -0.48
    bewerken          -0.48
    }:${              -0.47
    ッキリ            -0.47
     دقیق             -0.46
    [token not shown] -0.45
    skjaer            -0.45
    Positive Logits
    BeginInit         0.77
     himself          0.70
     themselves       0.65
     herself          0.64
    styleType         0.62
     yourself         0.61
     transfieras      0.61
     myself           0.60
     تضيفلها          0.59
     invokingState    0.58
    Activation Density: 0.091%

    No Known Activations