Explanations
exclamations and expressions of excitement in the text
Negative Logits
Token     Logit
gdx       -0.82
ede       -0.82
aure      -0.76
ation     -0.75
Rés       -0.75
Rump      -0.74
"):       -0.73
[`        -0.69
Carter    -0.69
böz       -0.67
Positive Logits
Token         Logit
%!            1.78
?!?           1.73
?!?!          1.64
!             1.55
!             1.55
!!!!!!        1.53
!!!!!!!       1.52
!!!!!!!!!!    1.44
?!            1.43
!"            1.42
Activations Density: 0.091%