INDEX
Explanations
phrases involving assistance or collaboration
instances of the word "help" and related concepts like assistance
Negative Logits
ILLE        -0.72
olver       -0.68
skelet      -0.66
Tone        -0.65
relevance   -0.64
oggles      -0.61
pigeon      -0.59
iolet       -0.58
rall        -0.58
mysteries   -0.57
Positive Logits
CG          0.77
rador       0.67
forts       0.67
owitz       0.66
gypt        0.66
Lama        0.64
ç           0.61
aries       0.61
ulator      0.60
ci          0.60
Activations Density: 0.054%