Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.08
2:0.09
3:0.09
4:0.10
5:0.07
6:0.08
7:0.08
8:0.08
9:0.06
10:0.09
11:0.07
Negative Logits
retweet     -1.75
Spotify     -1.60
Macron      -1.57
fruitful    -1.56
Discover    -1.45
thanking    -1.43
iTunes      -1.41
revealing   -1.41
Medium      -1.40
"],"        -1.40
Positive Logits
rag
1.74
��
1.66
��
1.54
Scand
1.53
gas
1.49
�
1.49
mare
1.48
angular
1.47
神
1.43
awa
1.42
Activations Density 0.000%
No Known Activations
This feature has no known activations.