Explanations
Intense emotional expressions and moments of exclamation
Exclamations and emphasized punctuation
Strong emphasis and shouting
Negative Logits
Token          Logit
featureID      -0.81
}}$}           -0.79
.[/            -0.77
незавершена    -0.75
.},            -0.75
."]            -0.75
]`             -0.73
--}}           -0.72
.")            -0.70
>--}}          -0.69
Positive Logits
Token          Logit
!              1.96
!!             1.68
!!!            1.67
!              1.55
!!!!           1.51
!)             1.49
!"             1.48
!              1.46
!!!!!          1.44
!”             1.43
Activations Density 0.201%
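
As a rough illustration of how a density figure like the one above can be read, the sketch below assumes that activation density means the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are assumptions made for illustration; they are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions where the feature fires.

    Assumption: a position counts as active when its activation
    is strictly greater than `threshold`.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, feature active on 201 of them.
sample = [0.0] * 99_799 + [1.0] * 201
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.201% would mean the feature fires on roughly 2 of every 1,000 tokens, which fits a feature that responds mainly to exclamation marks.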