INDEX
Explanations
phrases or words related to comparisons or levels of importance
phrases that express a sense of increase or enhancement
Negative Logits
crow      -0.86
breaks    -0.82
raid      -0.81
guard     -0.81
books     -0.77
pta       -0.76
washing   -0.75
tops      -0.74
bed       -0.73
elf       -0.71
Positive Logits
than          1.07
appreciation  1.03
than          0.89
importance    0.87
pains         0.86
abundance     0.86
heights       0.84
insight       0.83
amounts       0.82
quantities    0.82
Activations Density: 0.017%
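
As a rough illustration of how a density figure like this one can be read, the sketch below assumes it is the fraction of token positions on which the feature's activation is above zero. That definition, the function name activation_density, and the sample values are assumptions made for illustration; they are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all. The dashboard itself does not spell out its definition.
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 of 10,000 token positions.
sample = [0.0] * 9999 + [1.3]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.017% would mean the feature is active on roughly 17 of every 100,000 tokens.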