Explanations
adjectives and phrases related to rankings, positions, and comparisons
financial references involving numbers and measurements
Negative Logits
\\\\\\\\     -0.66
>>>>>>>>     -0.62
citiz        -0.58
volunte      -0.57
friends      -0.57
untled       -0.57
Anonymous    -0.56
Rena         -0.55
Friends      -0.55
Pets         -0.54
Positive Logits
omial         0.75
chords        0.74
theorem       0.73
rhy           0.73
decimal       0.72
arithmetic    0.72
integers      0.70
coefficients  0.70
syll          0.69
ratio         0.68
Activations Density: 1.103%