INDEX
Explanations
phrases or sentences with emphasis on a particular quality or aspect
demonstratives and references to specific concepts or ideas
Negative Logits
Token       Logit
Footnote    -0.79
letters     -0.72
OUND        -0.66
AUD         -0.66
wcsstore    -0.65
Voices      -0.63
aws         -0.62
Altern      -0.62
excerpts    -0.62
YR          -0.61
Positive Logits
Token       Logit
much        1.22
way         1.14
far         0.99
much        0.97
cheaply     0.95
badly       0.93
easily      0.93
magnitude   0.89
MUCH        0.89
kind        0.87
Activations Density: 0.090%