Explanations
phrases related to dropping something or someone off
Negative Logits
eur          -0.88
iola         -0.71
Wan          -0.66
gregation    -0.65
riors        -0.64
urally       -0.64
eering       -0.64
ãĢij          -0.63
Tang         -0.62
apo          -0.62
Positive Logits
kick         1.12
jaws         0.93
hints        0.93
down         0.88
bombs        0.79
leaflets     0.79
acid         0.77
bombshell    0.77
down         0.75
dime         0.74
Activations Density 0.523%
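
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is nonzero (the function name, threshold parameter, and sample data below are hypothetical, not taken from the dashboard):

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature
    fires at all; the dashboard itself does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, of which 5 have nonzero activation.
sample = [0.0] * 995 + [1.3, 0.4, 2.1, 0.9, 0.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that reading, a density of 0.523% would mean the feature fires on roughly 5 of every 1000 sampled tokens.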