INDEX
Explanations
profanity and expletives
expressions of frustration or disbelief
Negative Logits
Token     Logit
Joy       -0.75
atures    -0.67
FN        -0.66
obser     -0.65
iva       -0.63
Mos       -0.62
RNA       -0.61
omore     -0.61
Mesh      -0.61
izont     -0.61
Positive Logits
Token         Logit
happened      1.02
happens       0.79
else          0.79
transpired    0.79
ARE           0.77
ensued        0.74
THEY          0.72
!?"           0.72
chu           0.72
?!            0.71
Activations Density 0.059%