INDEX
Explanations
comparisons being made between different entities or concepts
instances of the word "comparison"
Negative Logits
jong        -0.80
adia        -0.76
ignt        -0.75
der         -0.71
ieri        -0.70
msg         -0.69
ktop        -0.68
mic         -0.67
zyme        -0.67
carpet      -0.66
Positive Logits
isons        0.93
comparisons  0.91
favorably    0.90
apples       0.81
comparison   0.80
comparing    0.79
Compare      0.77
compare      0.74
Osw          0.71
Compar       0.69
Activations Density: 0.017%