INDEX
Explanations
words related to large amounts, intensity, or significance
Negative Logits
©¶æ      -0.91
crow     -0.87
acus     -0.78
ades     -0.77
ramid    -0.77
arij     -0.76
ney      -0.75
illet    -0.73
arers    -0.72
EA       -0.72
Positive Logits
amount        1.13
importance    1.12
amounts       1.09
wealth        1.04
hardship      1.02
pleasure      0.96
riches        0.95
gratitude     0.94
quantities    0.93
hardships     0.92
Activations Density 0.056%