Explanations
the definite article introducing specific nouns
Negative Logits
both            0.81
both            0.76
outweighed      0.74
BOTH            0.71
거짓            0.67
eduanya         0.67
btw             0.66
ample           0.65
outweighs       0.63
ReferencePose   0.62
Positive Logits
earliest        1.02
oldest          0.93
Great           0.92
annual          0.91
famous          0.88
University      0.87
Society         0.86
Société         0.85
original        0.85
Senate          0.83
Activations Density 0.119%