INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.06
2:0.08
3:0.09
4:0.09
5:0.09
6:0.07
7:0.08
8:0.09
9:0.07
10:0.09
11:0.07
Negative Logits
Kardash      -1.97
Kardashian   -1.94
tabl         -1.92
cele         -1.91
headlines    -1.89
rumors       -1.88
celebrities  -1.84
Oprah        -1.82
medications  -1.81
slogans      -1.80
Positive Logits
annot        2.23
adra         1.98
REM          1.95
inet         1.89
ateur        1.87
adapt        1.87
vere         1.82
yth          1.74
very         1.73
ogh          1.72
Activations Density 0.000%
No Known Activations
This feature has no known activations.