INDEX
Explanations
specific names and influential figures in various contexts
Negative Logits
_taken    -0.14
λά        -0.13
Thrown    -0.13
risen     -0.13
swers     -0.13
lain      -0.13
리지      -0.13
spans     -0.13
theres    -0.12
thrown    -0.12
POSITIVE LOGITS
was       0.29
did       0.28
had       0.28
gave      0.26
began     0.26
took      0.26
didn      0.25
has       0.23
came      0.23
went      0.23
Activations Density 1.643%