INDEX
Explanations
No Explanations Found
Negative Logits
Token       Logit
Things      0.52
Any         0.52
Anything    0.52
Others      0.52
任何人      0.50
)-          0.50
Anyone      0.49
Others      0.49
things      0.48
غير         0.47
Positive Logits
Token       Logit
a           0.98
three       0.94
an          0.94
two         0.88
four        0.84
several     0.81
five        0.77
了一个      0.76
μια         0.75
ένα         0.74
Activations Density: 4.788%