INDEX
Explanations
expressions of hope and optimism
hope for help or positive outcome
Negative Logits
Token           Logit
domin           -0.44
ustimmung       -0.41
dominated       -0.40
"];             -0.40
<bos>           -0.40
reten           -0.39
Affiliations    -0.38
influ           -0.38
Custody         -0.38
onResume        -0.38
Positive Logits
Token           Logit
helps           0.65
helpful         0.58
estekak         0.56
Helps           0.56
help            0.56
ayude           0.55
ajuda           0.54
helps           0.54
helped          0.53
ayuda           0.52
Activations Density: 0.012%