INDEX
Explanations
praiseworthy expressions related to accomplishment or performance
positive affirmations and expressions of encouragement
Negative Logits
minent    -0.77
abel      -0.75
ushima    -0.73
erous     -0.67
URA       -0.66
Ń·        -0.65
umat      -0.65
ierrez    -0.65
agen      -0.64
agin      -0.64
Positive Logits
luck      1.26
job       1.19
thing     1.12
bye       1.11
grief     1.02
timing    0.98
luck      0.97
lord      0.96
idea      0.96
guy       0.92
Activations Density: 0.093%