Explanations
expressions of curiosity or questioning
NEGATIVE LOGITS
Datuak      -0.76
Astoria     -0.67
paksa       -0.63
IRQn        -0.62
Alba        -0.61
caloosa     -0.60
aroa        -0.60
alba        -0.57
vajal       -0.56
Industri    -0.56
POSITIVE LOGITS
wonder       1.76
wondering    1.76
Wonder       1.70
wonder       1.70
Wonder       1.64
WONDER       1.61
wondered     1.52
Wondering    1.40
wonders      1.36
Wonders      1.28
Activations Density 0.066%
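
As a rough illustration of the density figure, the sketch below assumes it means the fraction of token positions on which the feature fires (activation above zero). The function name, the threshold parameter, and the toy data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" means the share of token positions where the
    feature is active. This is an illustrative definition, not a confirmed one.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: the feature fires on 3 of 5000 token positions.
toy_activations = [0.0] * 4997 + [1.2, 0.4, 2.7]
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under that reading, a density of 0.066% would correspond to the feature firing on roughly 66 of every 100,000 tokens.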