Explanations
occurrences of specific proper nouns or formal titles
Negative Logits
محفوظة    -0.57
régler    -0.45
外部リンク    -0.45
kepentingan    -0.45
kepem    -0.44
المعيارى    -0.43
الوطنيه    -0.42
üzere    -0.42
WriteTagHelper    -0.42
čin    -0.42
Positive Logits
said    1.03
explained    0.93
says    0.90
ThroughAttribute    0.85
dijo    0.84
told    0.83
commented    0.82
recalled    0.82
said    0.81
spokeswoman    0.80
Activations Density 0.123%