Explanations
phrases that convey optimism about personal achievements and capacities
Negative Logits
__':    -0.65
__":    -0.65
SequentialGroup    -0.62
Pingback    -0.62
νώ    -0.62
насељу    -0.61
חיצוניים    -0.58
IsContent    -0.57
apiClient    -0.57
estekak    -0.55
Positive Logits
work    0.77
thing    0.70
job    0.70
deed    0.61
insee    0.60
homework    0.58
damage    0.57
job    0.57
role    0.56
的事情    0.56
Activations Density 0.145%