INDEX
Explanations
the word "be" at high activations, indicating a focus on statements of existence or identification
instances of the verb "be" in various contexts
Negative Logits
Token       Logit
illuminate  -0.69
anwhile     -0.69
imize       -0.67
rones       -0.67
ipeg        -0.66
rouse       -0.66
culosis     -0.65
ãĥ¼ãĥ«      -0.64
borrow      -0.64
ppings      -0.64
Positive Logits
Token       Logit
able        0.99
leeve       0.87
regarded    0.83
hemoth      0.82
heading     0.81
getting     0.81
considered  0.78
fits        0.77
influenced  0.74
fit         0.74
Activations Density 0.105%
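
As a rough illustration of what a density figure like the one above could mean, here is a minimal sketch. It assumes that density is the fraction of sampled token positions where the feature's activation is above zero. The page does not state this definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" counts a position as active when its activation
    is strictly greater than the threshold (zero by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 2000 token positions, 2 of which are active.
sample = [0.0] * 1998 + [1.3, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # formats the fraction as a percentage
```

Under this reading, a density of 0.105% would correspond to the feature firing on roughly one token in every thousand.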