Explanations
references to young people in various contexts
Head Attr Weights
0:0.02
1:0.01
2:0.05
3:0.32
4:0.02
5:0.02
6:0.12
7:0.14
8:0.02
9:0.05
10:0.08
11:0.08
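The head attribution weights above are easier to compare once ranked. A minimal Python sketch follows, assuming the values are per-head weights for heads 0 to 11 exactly as listed. The names `head_weights` and `rank_heads` are illustrative and not part of any tool's API.

```python
# Head attribution weights copied from the listing above (head index -> weight).
head_weights = {
    0: 0.02, 1: 0.01, 2: 0.05, 3: 0.32,
    4: 0.02, 5: 0.02, 6: 0.12, 7: 0.14,
    8: 0.02, 9: 0.05, 10: 0.08, 11: 0.08,
}


def rank_heads(weights):
    """Return (head, weight) pairs sorted from largest to smallest weight."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_heads(head_weights)
top_head, top_weight = ranked[0]
total = sum(head_weights.values())

# Report the dominant head and how much of the listed weight it accounts for.
print(f"top head: {top_head} (weight {top_weight:.2f})")
print(f"sum of listed weights: {total:.2f}")
print(f"top head share of listed weight: {top_weight / total:.1%}")
```

With these values, head 3 ranks first at 0.32, followed by heads 7 and 6. The listed weights sum to roughly 0.93 rather than 1, presumably because of rounding in the display.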
Negative Logits
ーテ:-1.31
ロ:-1.27
Nieto:-1.26
ーティ:-1.21
etsk:-1.21
ット:-1.21
ャ:-1.19
ィ:-1.18
abilia:-1.17
pta:-1.14
Positive Logits
disillusion:1.05
enson:1.04
reckoning:1.02
ju:1.01
touring:1.00
awa:0.98
RELE:0.98
indoctr:0.98
graduation:0.97
seniors:0.97
Activations Density 0.026%