INDEX
Explanations
instances of the verb "be" in various forms and contexts
Negative Logits
doors       -0.19
oriously    -0.19
semble      -0.19
lights      -0.17
mans        -0.16
mediate     -0.16
rox         -0.16
soon        -0.16
recht       -0.16
hammad      -0.15
Positive Logits
heading     0.28
aming       0.27
arded       0.27
ech         0.26
aded        0.25
anie        0.25
hest        0.25
acons       0.25
aver        0.24
eh          0.23
Activations Density: 0.016%