INDEX
Explanations
expressions of surprise or realization
"Oh" interjection
Negative Logits
surla       -0.47
iële        -0.45
Amit        -0.39
Amic        -0.39
Abhishek    -0.39
Landmark    -0.38
estekak     -0.37
oise        -0.37
erdere      -0.37
kében       -0.36
Positive Logits
Oh          1.26
Oh          1.20
oh          1.04
oh          1.02
Ohhh        0.82
Oooh        0.81
Ohh         0.80
Ohhhh       0.79
Ooh         0.75
OH          0.74
Activations Density: 0.007%