INDEX
Explanations
occurrences of the word "being" in various contexts
Negative Logits
quan        -0.17
unger       -0.15
Horse       -0.15
horse       -0.15
enstein     -0.14
ximo        -0.14
ldkf        -0.13
bersome     -0.13
acks        -0.13
adients     -0.13
Positive Logits
irut        0.15
بت          0.15
atus        0.14
spite       0.13
worn        0.13
ovenant     0.13
Doch        0.13
uiltin      0.13
Structured  0.13
omy         0.13
Activations Density 0.052%