INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.08
1:0.08
2:0.08
3:0.06
4:0.08
5:0.07
6:0.07
7:0.09
8:0.07
9:0.08
10:0.08
11:0.09
Negative Logits
nep       -2.71
describ   -2.56
sim       -2.51
aiman     -2.42
aban      -2.41
inav      -2.38
rept      -2.38
pitted    -2.37
Siber     -2.37
753       -2.36
Positive Logits
Leaks        2.80
Letter       2.65
Compliance   2.54
Hilbert      2.51
Fell         2.49
Grounds      2.43
Patent       2.39
————         2.36
briefs       2.30
ß            2.27
Activations Density 0.000%
No Known Activations
This feature has no known activations.