INDEX
Explanations
opening phrases that indicate the start of a discussion or narrative
Negative Logits
ributor  -0.53
िखित  -0.51
باخ  -0.48
pères  -0.48
classified  -0.46
áže  -0.44
vícti  -0.44
chapper  -0.43
Unclassified  -0.43
újo  -0.43
Positive Logits
however  0.94
però  0.81
however  0.76
όμως  0.74
However  0.73
toutefois  0.73
However  0.73
azonban  0.71
entanto  0.70
jednak  0.69
Activations Density 0.516%