INDEX
Explanations
phrases related to actions or events taking place
actions or references to physical movement and separation
Negative Logits
cel       -0.80
jc        -0.79
Chili     -0.79
vine      -0.79
Comm      -0.78
glomer    -0.78
Cel       -0.78
Brit      -0.74
rencies   -0.74
ascal     -0.74
Positive Logits
out       1.14
OUT       1.08
outs      1.06
out       1.06
OUT       1.04
Out       0.99
Outs      0.98
outs      0.95
ay        0.93
Out       0.88
Activations Density 0.187%