INDEX
Explanations
phrases indicating personal accountability and responsibility
Negative Logits
ukan        -0.17
γέν         -0.15
rico        -0.15
ilot        -0.14
oya         -0.14
arters      -0.14
/o          -0.14
localVar    -0.13
ốc          -0.13
oes         -0.13
Positive Logits
，或            0.50
OR             0.48
Alternatively  0.47
Or             0.46
Alternatively  0.42
or             0.41
.Or            0.37
或             0.36
.OR            0.36
Or             0.35
Activations Density 0.444%
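
The density figure can be read as the share of tokens on which the feature fires. The sketch below assumes that definition (activation strictly greater than zero counts as firing); the dashboard does not state its exact threshold, and the function name `activation_density` and the sample values are hypothetical, chosen only so the result matches the figure above.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is
    strictly greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 4 active tokens out of 900.
sample = [0.0] * 896 + [1.2, 0.4, 3.1, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.444%
```

A density this low means the feature is sparse: under the assumption above, it fires on fewer than one token in two hundred.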