INDEX
Explanations
instances of positive evaluation or performance feedback
the occurrence of the word "and"
Negative Logits
actionDate    -0.94
payer         -0.86
ictionary     -0.86
itarian       -0.81
insula        -0.75
Enlarge       -0.74
Pub           -0.72
orno          -0.72
ensor         -0.71
âĸĪâĸĪ        -0.71
Positive Logits
secondly      0.96
luckily       0.89
teammate      0.88
hopefully     0.88
consequently  0.87
thus          0.87
finishing     0.86
exce          0.86
versatility   0.86
scored        0.85
Activations Density: 0.238%