INDEX
Explanations
pronouns and questions
sentences expressing opinions or judgments about actions and events
Negative Logits
NAT       -0.63
Balt      -0.61
�         -0.60
imet      -0.58
),"       -0.58
�         -0.57
´         -0.57
)",       -0.56
ogenic    -0.56
����      -0.56
Positive Logits
goddamn   0.82
irony     0.75
godd      0.71
dunno     0.70
hilar     0.70
Stupid    0.69
sucker    0.63
sane      0.63
damned    0.63
Thing     0.61
Activations Density 1.193%