INDEX
Explanations
No Explanations Found
Negative Logits
.        1.59
↵        1.31
top      1.15
1        1.10
ta       1.08
(        1.05
ts       1.03
(blank)  1.02
2        1.01
t        0.97
Positive Logits
ayatan        2.08
kprop         2.05
pptn          1.99
vehement      1.98
EnglishMarks  1.97
anisot        1.96
Quels         1.95
쾺            1.95
Hitpoint      1.94
vitth         1.91
Activations Density 0.000%
This feature has no known activations.