INDEX
Explanations
positive words related to virtues or benefits
references to positive qualities or actions
Negative Logits
eters       -0.79
ptin        -0.73
Pavilion    -0.71
kson        -0.71
ocene       -0.70
pper        -0.70
âĹ¼         -0.67
pora        -0.67
eteria      -0.67
Strait      -0.66
Positive Logits
enough      1.21
reads       1.07
Samar       1.00
luck        0.99
deed        0.96
intentions  0.96
deeds       0.93
NESS        0.87
sword       0.87
ol          0.87
Activations Density: 0.069%