INDEX
Explanations
terms related to regional identity and cultural references
Head Attr Weights
0:0.05
1:0.03
2:0.13
3:0.04
4:0.19
5:0.07
6:0.03
7:0.03
8:0.13
9:0.16
10:0.05
11:0.03
Negative Logits
FML:-1.51
amp:-1.28
furt:-1.25
cod:-1.16
oder:-1.16
oğ:-1.16
ァ:-1.15
レ:-1.13
quer:-1.12
ndra:-1.10
Positive Logits
shoulders:1.27
UGE:1.22
Annotations:1.17
gorilla:1.17
inately:1.14
spoilers:1.09
warts:1.09
sidx:1.06
welcome:1.03
galitarian:1.01
Activations Density 0.002%