INDEX
Explanations
references to significant news events or figures
Negative Logits
Extra            -0.15
.undefined       -0.14
ッカー           -0.13
Extras           -0.13
ンディ           -0.12
Current          -0.12
Miscellaneous    -0.12
Simple           -0.12
avour            -0.12
vůbec            -0.12
Positive Logits
Isn              0.23
Didn             0.23
Could            0.23
Couldn           0.22
Wouldn           0.22
Finally          0.20
Doesn            0.20
vs               0.20
Was              0.20
Became           0.20
Activations Density 0.197%