INDEX
Explanations
phrases and terms related to ongoing situations or political contexts
Head Attr Weights
0:0.03
1:0.02
2:0.09
3:0.04
4:0.02
5:0.03
6:0.30
7:0.06
8:0.04
9:0.04
10:0.06
11:0.23
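
As a rough reading aid, the weights listed above can be ranked to see which heads dominate the attribution. This is only a sketch: the values are copied from the listing, the variable names are illustrative, and the listing does not say how the weights are normalized (the rounded values sum to about 0.96, not 1).

```python
# Head attribution weights copied from the listing (head index -> weight).
head_weights = {
    0: 0.03, 1: 0.02, 2: 0.09, 3: 0.04, 4: 0.02, 5: 0.03,
    6: 0.30, 7: 0.06, 8: 0.04, 9: 0.04, 10: 0.06, 11: 0.23,
}

# Rank heads from largest to smallest weight.
ranked = sorted(head_weights.items(), key=lambda kv: kv[1], reverse=True)
top_two = ranked[:2]

# Share of the listed total carried by the two largest heads.
total = sum(head_weights.values())
top_two_share = sum(w for _, w in top_two) / total

print("top heads:", [h for h, _ in top_two])
print("listed total:", round(total, 2))
print("top-two share of listed total:", round(top_two_share, 3))
```

On these numbers, heads 6 and 11 together carry a little over half of the listed weight.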
Negative Logits
itely      -1.20
oops       -1.20
undai      -1.10
ifferent   -1.08
goats      -1.07
angan      -1.05
imb        -1.03
ones       -1.03
gif        -1.03
holiest    -1.02
Positive Logits
(undecodable token)   1.23
(undecodable token)   1.21
Classification        1.15
機                    1.11
assets                1.05
statutes              1.05
ablishment            1.03
reactive              1.03
netflix               1.03
rane                  1.03
Activations Density 0.055%
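
To put the density figure in concrete terms, the short sketch below converts the percentage into an expected count of activating tokens. It assumes that "density" means the fraction of sampled tokens on which the feature fires, which the listing does not state; the sample size is a hypothetical value chosen for illustration.

```python
# Density as reported in the listing, in percent.
density_percent = 0.055

# Assumption: density is the fraction of tokens with a nonzero activation.
density_fraction = density_percent / 100

# Hypothetical sample size, not taken from the listing.
sample_tokens = 1_000_000

# Expected number of tokens on which the feature would activate.
expected_active = round(density_fraction * sample_tokens)
print(expected_active)
```

Under that assumption the feature is sparse: roughly one activating token per couple of thousand tokens.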