INDEX
Explanations
references to praise or positive acknowledgment
Negative Logits
transQ          -0.70
GEBURTSDATUM    -0.67
Geplaatst       -0.62
kyl             -0.60
Portail         -0.57
Answer          -0.56
.*")]           -0.56
Frust           -0.54
desaf           -0.54
Grim            -0.53
Positive Logits
praise          2.09
praising        1.94
praises         1.93
praised         1.77
admiration      1.65
Praise          1.62
compliments     1.60
praise          1.59
compliment      1.59
admiring        1.47
Activations Density: 0.224%