INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.08
2:0.08
3:0.07
4:0.08
5:0.07
6:0.07
7:0.09
8:0.08
9:0.09
10:0.08
11:0.08
Negative Logits
Rogue    -3.20
Limit    -2.96
quished  -2.95
elsh     -2.78
DoS      -2.76
Slayer   -2.76
atche    -2.75
rikes    -2.70
rast     -2.65
ifted    -2.62
Positive Logits
Femin         2.66
documenting   2.57
feminist      2.46
corrobor      2.44
Patri         2.43
models        2.42
Feminist      2.41
Easter        2.40
presentation  2.37
bore          2.36
Activations Density 0.000%
No Known Activations
This feature has no known activations.