INDEX
Explanations
phrases related to criticizing or mocking others
punctuation and sentence endings
Negative Logits
inherited      -0.86
moth           -0.81
occupancy      -0.79
abduct         -0.78
landsl         -0.76
involuntary    -0.76
envis          -0.76
corrosion      -0.76
strugg         -0.75
glim           -0.75
Positive Logits
Apparently     1.29
Turns          1.29
Needless       1.29
Specifically   1.27
Afterwards     1.22
Naturally      1.21
Seriously      1.21
Thankfully     1.21
Additionally   1.20
Notably        1.19
Activations Density: 0.470%