INDEX
Explanations
occurrences of the word "being" in various contexts
Negative Logits
lse          -0.17
kü           -0.15
Beginning    -0.15
differing    -0.15
ácil         -0.15
ocator       -0.14
ulumi        -0.14
beginning    -0.14
remaining    -0.14
woke         -0.14
Positive Logits
able         0.35
unable       0.27
ness         0.26
part         0.23
asked        0.21
Able         0.20
apart        0.20
told         0.20
offered      0.20
around       0.20
Activations Density 0.060%