INDEX
Explanations
statements and expressions of speech or attribution
Negative Logits
Portail      -0.69
']")         -0.60
الحياه       -0.59
zejména      -0.56
Department   -0.56
Trabajo      -0.55
slou         -0.55
Verfügung    -0.55
Copia        -0.54
Ancak        -0.54
Positive Logits
saying       1.60
Saying       1.59
SAY          1.58
say          1.56
SAY          1.55
saying       1.51
Saying       1.49
say          1.46
Say          1.45
Say          1.44
Activations Density 0.183%