Explanations
phrases related to actions and activities
instances of people performing actions
Negative Logits
orn                   -0.76
chell                 -0.71
iq                    -0.64
è£ħ                   -0.63
ior                   -0.62
guiActiveUnfocused    -0.62
iu                    -0.61
inct                  -0.61
fter                  -0.61
izon                  -0.60
Positive Logits
albeit       1.03
however      0.91
but          0.88
sir          0.81
though       0.81
meanwhile    0.75
huh          0.71
etc          0.69
moreover     0.68
but          0.64
Activations Density 0.463%
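
As a rough illustration of what a density figure like the one above may represent, the sketch below computes the fraction of tokens on which a feature fires. It assumes density means "share of tokens with activation above a threshold"; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density is defined as (active tokens) / (all tokens).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations over 1000 tokens, 5 of which are non-zero.
sample = [0.0] * 995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumed definition, a density of 0.463% would mean the feature is active on roughly 4 or 5 of every 1000 tokens in the sampled text.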