INDEX
Explanations
references to individual stakeholders and their perspectives or actions within a context
Negative Logits
forgetting    -0.16
cherche       -0.15
chet          -0.14
illard        -0.14
착            -0.14
rq            -0.14
.Manager      -0.13
εδ            -0.13
iginal        -0.13
phia          -0.13
Positive Logits
feel          0.43
believe       0.40
feels         0.39
think         0.37
认为          0.36
perceive      0.35
believes      0.35
perception    0.35
feel          0.34
觉得          0.33
Activations Density 0.238%