INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.06
2:0.08
3:0.09
4:0.10
5:0.08
6:0.08
7:0.08
8:0.06
9:0.08
10:0.07
11:0.09
Negative Logits
chat: -1.83
awk: -1.63
dropping: -1.62
includ: -1.61
zie: -1.60
ongh: -1.59
aband: -1.58
arus: -1.57
acho: -1.55
zhen: -1.52
Positive Logits
BLIC: 1.69
gradient: 1.65
tert: 1.57
Kenyan: 1.57
Albert: 1.49
erenn: 1.49
VERT: 1.49
GREEN: 1.46
Greenwich: 1.46
limestone: 1.44
Activations Density 0.000%
No Known Activations
This feature has no known activations.