INDEX
Explanations
complex lexical structures related to social dynamics and political contexts
Head Attr Weights
0:0.04
1:0.01
2:0.11
3:0.15
4:0.29
5:0.03
6:0.04
7:0.04
8:0.08
9:0.06
10:0.05
11:0.05
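
A minimal sketch for reading the head weights above, assuming each entry is a head index followed by that head's share of the attribution (the listing does not define the quantity, so this reading is an assumption). The names head_weights and rank_heads are hypothetical, introduced only for illustration.

```python
# Head attribution weights copied from the listing above (head index -> weight).
# Assumption: each weight is that attention head's share of the attribution.
head_weights = {
    0: 0.04, 1: 0.01, 2: 0.11, 3: 0.15, 4: 0.29, 5: 0.03,
    6: 0.04, 7: 0.04, 8: 0.08, 9: 0.06, 10: 0.05, 11: 0.05,
}


def rank_heads(weights):
    """Return (head, weight) pairs sorted from largest to smallest weight."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_heads(head_weights)
top_head, top_weight = ranked[0]      # head with the largest weight
total = sum(head_weights.values())    # how much weight the 12 heads cover

print(f"top head: {top_head} (weight {top_weight:.2f})")
print(f"sum of listed weights: {total:.2f}")
```

Under this reading, head 4 carries the largest weight, followed by heads 3 and 2, and the listed weights sum to slightly less than 1, which may reflect rounding or attribution that is not assigned to any head.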
Negative Logits
reminded      -1.79
terday        -1.61
田            -1.58
enos          -1.57
76561         -1.57
urai          -1.56
soon          -1.56
��            -1.50
�醒           -1.47
uyomi         -1.46
Positive Logits
altogether    1.66
ones          1.62
equivalents   1.51
nor           1.41
originals     1.40
darts         1.34
actual        1.30
pills         1.27
rud           1.26
conventional  1.26
Activations Density 0.044%
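
As a rough illustration of what a density of 0.044% means, the sketch below converts the percentage into an expected number of activating tokens. It assumes density is the fraction of tokens on which the feature has a nonzero activation, which the listing does not state. The function name expected_activations is hypothetical.

```python
# Activation density from the listing above, expressed as a percentage.
density_percent = 0.044


def expected_activations(density_percent, n_tokens):
    """Expected number of tokens on which the feature fires.

    Assumption: density is the fraction of tokens with nonzero activation.
    """
    return density_percent / 100.0 * n_tokens


# Roughly 44 activating tokens per 100,000 tokens under this assumption.
per_100k = expected_activations(density_percent, 100_000)
print(f"expected activations per 100,000 tokens: {per_100k:.1f}")
```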