INDEX
Explanations
expressions of surprise or emphasis
Negative Logits
myſelf        -1.10
itſelf        -0.98
Efq           -0.95
useRouter     -0.94
houſe         -0.89
themſelves    -0.87
himſelf       -0.85
Houſe         -0.85
ſeveral       -0.78
againſt       -0.78
Positive Logits
Oh            1.19
Oh            1.11
oh            1.00
oh            0.99
OH            0.92
OH            0.83
Ohh           0.81
sweet         0.78
็จ            0.76
Ohhhh         0.76
Activations Density: 0.061%