Explanations
punctuation marks, particularly colons
icons or emoticons used to express emotions or reactions
Negative Logits
Token          Logit
ifying         -0.79
utive          -0.74
izabeth        -0.71
idad           -0.70
conversions    -0.69
ified          -0.67
heed           -0.67
ilingual       -0.65
ilitation      -0.64
ibles          -0.64
Positive Logits
Token           Logit
::::::::        1.68
::::            1.41
;;;;;;;;;;;;    0.87
;;;;;;;;        0.87
-(              0.86
,,,,            0.85
,,,,,,,,        0.83
cause           0.70
hover           0.68
;;;;            0.68
Activations Density: 0.032%