INDEX
Explanations
contractions and forms of "to have" indicating past experiences or states
Negative Logits
ROY       -0.14
rex       -0.14
yll       -0.14
roy       -0.13
yz        -0.13
y         -0.13
.cbo      -0.13
AWN       -0.13
aki       -0.13
-bound    -0.13
Positive Logits
previously    0.19
ipple         0.17
Previously    0.17
Ìī            0.16
plx           0.16
imens         0.16
Previously    0.15
reno          0.15
ÑĢеÑģ          0.15
CHASE         0.15
Activations Density 0.289%