INDEX
Explanations
phrases that refer to superlatives and rankings
the repeated use of the word "the."
Negative Logits
illion     -0.74
partake    -0.72
assume     -0.70
ambo       -0.70
dale       -0.68
OTA        -0.67
FILE       -0.66
rand       -0.65
riot       -0.64
render     -0.64
Positive Logits
easiest    1.14
strongest  0.99
safest     0.94
hardest    0.94
toughest   0.92
opposite   0.91
simplest   0.90
same       0.90
greatest   0.86
largest    0.85
Activations Density 0.072%
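
A minimal sketch of how a density figure like the one above could be computed, assuming "activations density" means the fraction of token positions on which the feature's activation is nonzero. The function name, the threshold parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a position counts as "active" when its activation is
    strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 7 of which activate the feature.
acts = [0.0] * 10_000
for i in (12, 503, 1999, 4242, 6000, 7777, 9001):
    acts[i] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, a density of 0.072% would correspond to roughly 7 active positions per 10,000 tokens.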