INDEX
Explanations
"buttermilk", "but contains", "but not", "butch"
Negative Logits
0.47
0.45
டன்
0.43
0.41
Neste
0.40
따라서
0.40
வே
0.40
0.39
0.38
0.38
Positive Logits
termilk       0.74
ternut        0.71
dennoch       0.67
thole         0.63
nevertheless  0.60
rition        0.57
tery          0.54
nonetheless   0.54
তবুও          0.54
ters          0.54
Activations Density 0.007%