Explanations
references to personal or communal struggles and relationships
Negative Logits
274
-0.15
VERRIDE
-0.15
ури
-0.14
ルド
-0.14
(State
-0.14
/Sub
-0.14
غال
-0.14
/social
-0.14
ender
-0.14
specifier
-0.14
Positive Logits
say
0.73
said
0.63
says
0.61
saying
0.60
say
0.58
说
0.57
Say
0.57
SAY
0.52
Say
0.51
說
0.48
Activations Density 0.402%