Explanations
pronouns followed by actions
Negative Logits
Notably    1.11
较为    1.08
にて    1.00
Notably    0.98
এরূপ    0.97
poiché    0.95
نیز    0.93
oldukça    0.92
hehe    0.91
較    0.89
Positive Logits
throw    0.90
sitting    0.88
wanna    0.87
look    0.85
throws    0.84
াম্প    0.81
sit    0.81
surround    0.78
sat    0.77
threw    0.76
Activations Density: 0.078%