Explanations
instances of the word "apply" within different contexts
Negative Logits
watching    -0.70
footed      -0.66
ument       -0.64
nia         -0.64
hoff        -0.63
birds       -0.62
aired       -0.62
owship      -0.62
acters      -0.61
bill        -0.60
Positive Logits
pressure        0.95
brakes          0.72
sunscreen       0.71
ogen            0.68
rigorous        0.67
arate           0.65
liber           0.65
lessons         0.64
lipstick        0.64
mathematical    0.63
Activations Density: 0.045%