INDEX
Explanations
assertive statements or instructions with a somewhat forceful tone
phrases indicating reliance or dependency on external factors
Negative Logits
Minor        -0.77
SPONSORED    -0.75
toggle       -0.71
SEE          -0.67
ivari        -0.67
Gender       -0.67
alter        -0.66
Includes     -0.65
Interview    -0.64
asses        -0.63
Positive Logits
Promise      0.78
promise      0.77
already      0.76
certainly    0.75
sooner       0.73
'll          0.73
surely       0.72
scams        0.68
ngth         0.68
otherwise    0.66
Activations Density: 0.458%