INDEX
Explanations
Instances of the word "know" and its variations, indicating a focus on knowledge and awareness.
Negative Logits
वत        -0.15
quist     -0.14
سات       -0.14
faction   -0.14
ãĥ¥       -0.13
ières     -0.13
eward     -0.13
uman      -0.12
ipar      -0.12
anst      -0.12
Positive Logits
about     0.40
how       0.34
ledged    0.32
-how      0.32
what      0.30
enough    0.29
who       0.29
about     0.28
led       0.28
lege      0.27
Activations Density 0.113%