Explanations
expressions of comparison and contrast between different subjects or ideas
Negative Logits
sharp       -0.22
weakest     -0.15
shar        -0.15
sharpen     -0.15
sharply     -0.14
anybody     -0.14
happiest    -0.14
sharp       -0.14
quickest    -0.14
Sharp       -0.13
Positive Logits
more        0.75
更          0.63
more        0.57
更加        0.57
better      0.56
lebih       0.55
更          0.54
daha        0.53
greater     0.52
less        0.51
Activations Density 3.034%