INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.09
1:0.09
2:0.08
3:0.07
4:0.07
5:0.07
6:0.08
7:0.06
8:0.08
9:0.08
10:0.08
11:0.09
Negative Logits
tro       -1.96
apore     -1.66
cele      -1.44
dragons   -1.43
lde       -1.42
cho       -1.41
oration   -1.38
Hera      -1.37
cheat     -1.36
wagen     -1.36
Positive Logits
ONSORED            2.25
OCK                1.68
+++                1.65
Oliv               1.54
GGGGGGGG           1.54
++++++++++++++++   1.51
Ramirez            1.46
Gutierrez          1.46
ertodd             1.45
Rid                1.43
Activations Density 0.000%
No Known Activations
This feature has no known activations.