Explanations
repetitive phrases and expressions of frustration or emphasis
Negative Logits
Token    Logit
kano     -0.86
Hale     -0.73
".$_     -0.69
Purdy    -0.67
ZI       -0.66
Hern     -0.64
fers     -0.63
$'       -0.63
Swain    -0.62
=>"      -0.62
Positive Logits
Token        Logit
again        1.60
Again        1.58
Again        1.56
again        1.56
AGAIN        1.50
AGAIN        1.46
igjen        1.20
Lagi         1.07
novamente    1.06
wieder       1.05
Activations Density 0.045%