    Explanations

    phrases related to actions or statements made by specific individuals

    statements expressing opinions or claims about individuals

    Negative Logits
    Token    Logit
    .(       -0.81
    .</      -0.80
    .*       -0.79
    }.       -0.79
    .<       -0.78
    .}       -0.75
    。       -0.71
    .-       -0.70
    >.       -0.69
    :-       -0.65
    Positive Logits
    Token      Logit
    ,"         0.98
    xiety      0.93
    %"         0.86
    ffield     0.85
    "),        0.84
    ,'"        0.83
     [         0.82
     ain       0.82
    zbollah    0.82
    initely    0.82
    Activation Density: 0.334%

    No Known Activations