Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.07
2:0.08
3:0.09
4:0.07
5:0.08
6:0.07
7:0.08
8:0.09
9:0.09
10:0.08
11:0.08
Negative Logits
iverse      -1.84
rity        -1.61
darn        -1.61
faithful    -1.57
!",         -1.57
ngth        -1.55
!.          -1.54
.",         -1.52
misc        -1.52
loyal       -1.51
Positive Logits
Atkins      1.69
Weston      1.68
Morse       1.66
Nielsen     1.62
Morgan      1.57
Shark       1.57
Schne       1.56
�           1.56
stadt       1.55
Quake       1.54
Activations Density 0.000%
This feature has no known activations.