Explanations
vary significantly, wildly, dramatically
Negative Logits
yourself          1.02
Yourself          1.02
yourself          0.96
myself            0.95
careful           0.94
Careful           0.87
cuidadosamente    0.86
carefully         0.86
முழுவதும்  0.83
নিজেই  0.82
Positive Logits
unexpectedly      1.01
unpredict         0.97
organically       0.92
itself            0.92
naturally         0.91
spontaneously     0.90
predictably       0.84
naturalmente      0.84
unchecked         0.82
عندي  0.82
Activations Density 0.553%