INDEX
Explanations
phrases indicating reports or statements made by individuals
Negative Logits
說話              -0.55
мәкал             -0.54
nocturno          -0.47
kasarigan         -0.45
brazos            -0.45
Aufmerksamkeit    -0.45
považ             -0.44
说话              -0.44
blames            -0.44
dflare            -0.44
Positive Logits
hinted            0.58
indicated         0.58
implied           0.56
intimated         0.56
indicated         0.54
所示              0.50
demonstrated      0.48
показа            0.48
predicted         0.46
suggested         0.45
Activations Density 0.420%