INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.07
2:0.09
3:0.07
4:0.08
5:0.08
6:0.08
7:0.09
8:0.07
9:0.08
10:0.08
11:0.09
Negative Logits
arius -2.97
ム -2.92
achine -2.89
ridor -2.88
veyard -2.87
ヴ -2.86
ocy -2.83
early -2.75
numbered -2.74
aiden -2.74
Positive Logits
puppet 3.14
AMA 3.07
pupp 2.78
Million 2.76
Knicks 2.75
Dude 2.67
newsp 2.58
TMZ 2.55
Silk 2.54
Manip 2.54
Activations Density 0.000%
This feature has no known activations.