INDEX
Explanations
cultural and linguistic terms from various languages
Negative Logits
accrued      -0.86
increment    -0.77
Blair        -0.76
launch       -0.76
waiver       -0.76
resume       -0.75
resur        -0.73
timely       -0.73
retailers    -0.72
creep        -0.72
Positive Logits
sic          1.87
pron         1.69
meaning      1.65
literally    1.64
formerly     1.46
aka          1.44
pictured     1.41
sometimes    1.34
also         1.33
see          1.31
Activations Density: 0.097%