Explanations
expressions of surprise or realization
"Ah" or "Ahh" variants
Negative Logits
Token      Logit
vett       -0.50
Virginie   -0.45
Perse      -0.45
Mette      -0.44
cupine     -0.43
ccb        -0.43
mbangan    -0.42
føl        -0.42
veral      -0.42
ellen      -0.42
Positive Logits
Token      Logit
Ah         0.71
Ah         0.68
ApJ        0.58
Ach        0.57
Ach        0.54
ah         0.54
AH         0.51
✥          0.50
Ahl        0.50
ah         0.50
Activations Density: 0.011%