INDEX
Explanations
phrases indicating comparison or cause and effect
phrases indicating increasing quantities or magnitude
Negative Logits
istani    -0.87
ahime     -0.79
ioch      -0.76
edit      -0.73
aido      -0.71
iasco     -0.70
lease     -0.70
idon      -0.69
opol      -0.69
amon      -0.68
Positive Logits
better    1.46
harder    1.45
worse     1.39
stronger  1.39
louder    1.36
clearer   1.35
quicker   1.32
greater   1.32
easier    1.31
happier   1.30
Activations Density: 0.029%