Explanations
references to lip movements or lip service
Negative Logits
Leap        -0.84
ISION       -0.83
IRD         -0.75
Samar       -0.68
theless     -0.67
Blessed     -0.67
ENCY        -0.66
NESS        -0.66
ESV         -0.65
Reloaded    -0.65
Positive Logits
sticks      1.06
atures      0.98
stick       0.97
ograph      0.95
ids         0.94
ogenesis    0.94
seys        0.93
ogen        0.91
etsk        0.90
ograms      0.89
Activations Density 0.008%