INDEX
Explanations
curse words
informal expressions of strong emotion or opinion
Negative Logits
telecommunications  -0.78
intra               -0.78
Proceedings         -0.77
Directorate         -0.75
exclusion           -0.74
advisory            -0.74
Partnership         -0.73
susceptibility      -0.73
Transition          -0.72
Provision           -0.71
Positive Logits
aaaa    1.08
!!!     1.07
!!!!    1.06
!!!!!   1.06
eeee    1.04
haha    1.03
;)      1.03
oooo    1.03
fuck    1.02
laughs  1.01
Activations Density 0.679%
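
As a rough illustration of how a density figure like this is commonly read, the sketch below computes the fraction of token positions on which a feature fires. It assumes density means the share of positions with an activation above zero, which the page itself does not confirm. The function name `activation_density` and the sample values are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active. This definition is not stated on the page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations for one feature (made-up values).
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.7, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.679% would mean the feature is active on roughly 7 out of every 1,000 token positions sampled.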