INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.07
2:0.08
3:0.07
4:0.09
5:0.06
6:0.07
7:0.10
8:0.08
9:0.07
10:0.08
11:0.09
Negative Logits
screenings    -1.76
misplaced     -1.64
sneaking      -1.59
Revelations   -1.58
lies          -1.57
locating      -1.55
Homeland      -1.54
Grande        -1.52
unfounded     -1.52
Awakening     -1.46
Positive Logits
mbuds         2.13
oun           1.97
�             1.88
edom          1.83
undrum        1.81
Leilan        1.79
arbon         1.78
otom          1.68
netflix       1.66
────          1.66
Activations Density 0.000%
This feature has no known activations.