INDEX
Explanations
phrases indicating a directive or instruction
statements regarding personal rights or autonomy
Negative Logits
guiActiveUnfocused    -0.79
ĺħ                    -0.69
atro                  -0.66
ascal                 -0.65
ools                  -0.65
oward                 -0.64
assian                -0.61
ries                  -0.61
Chair                 -0.61
chaired               -0.59
Positive Logits
abouts                0.76
NESS                  0.72
uni                   0.72
Pyr                   0.68
optional              0.67
ighting               0.66
Eps                   0.65
pai                   0.64
nz                    0.64
ski                   0.60
Activations Density 0.000%