INDEX
Explanations
phrases related to taking action
occurrences of the word "strike" in various contexts
Negative Logits
Token       Logit
regon       -0.73
ettel       -0.71
ŀ           -0.69
umption     -0.69
glomer      -0.66
icter       -0.65
ophers      -0.65
raints      -0.65
etsk        -0.65
opal        -0.65
Positive Logits
Token       Logit
strike      1.02
strikes     0.84
striking    0.84
breakers    0.82
struck      0.81
ters        0.80
force       0.80
strike      0.75
ting        0.74
collar      0.71
Activations Density 0.012%