INDEX
Explanations
evaluative statements about actions or situations indicating their complexity or impact
Negative Logits
ongevity     -0.65
inite        -0.59
venants      -0.57
oreal        -0.56
adium        -0.55
hens         -0.53
ioxide       -0.53
arnaev       -0.52
issions      -0.52
ensions      -0.51
Positive Logits
!).          1.04
!),          0.84
).           0.83
?).          0.81
anyway       0.80
!)           0.78
euphem       0.75
)}           0.72
)</          0.72
considering  0.71
Activations Density 0.344%
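
As a rough illustration of what the density figure above may represent, the sketch below assumes density means the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample values are all assumptions made for illustration; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations over 8 token positions (illustrative, not from the source).
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.0, 0.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.344% would mean the feature fires on roughly 3 to 4 of every 1,000 tokens sampled.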