INDEX
Explanations
phrases with a sense of urgency and direct instructions
expressions of personal knowledge or persuasive appeals
Negative Logits
Token         Logit
Plaint        -0.71
Canaver       -0.65
RESULTS       -0.65
aucus         -0.63
Defendants    -0.62
:[            -0.60
'[            -0.60
OSP           -0.60
IPM           -0.59
ropri         -0.58
Positive Logits
Token         Logit
kidding       0.81
ya            0.79
blah          0.76
!.            0.73
cha           0.73
!).           0.72
?).           0.71
math          0.69
maths         0.67
gotta         0.67
Activations Density: 0.545%