INDEX
Explanations
No Explanations Found
Head Attr Weights
Head  Weight
0     0.07
1     0.07
2     0.09
3     0.07
4     0.08
5     0.07
6     0.07
7     0.09
8     0.09
9     0.08
10    0.08
11    0.09
Negative Logits
Vanity    -1.82
Blanc     -1.80
aloud     -1.73
teammate  -1.67
Ki        -1.66
Cay       -1.61
Nu        -1.60
Sirius    -1.57
Peterson  -1.56
Tier      -1.55
Positive Logits
rob       2.00
hyde      1.95
ウス      1.90
strap     1.78
WARE      1.76
erial     1.71
aunders   1.69
control   1.69
fuck      1.66
mone      1.63
Activations Density 0.000%
No Known Activations
This feature has no known activations.