INDEX
Explanations
No Explanations Found
Head Attr Weights
0:0.07
1:0.07
2:0.07
3:0.08
4:0.10
5:0.07
6:0.09
7:0.08
8:0.08
9:0.07
10:0.08
11:0.07
Negative Logits
Sutherland   -2.69
Olson        -2.61
Feld         -2.57
Freedom      -2.55
Shelby       -2.54
newfound     -2.54
Pacers       -2.42
newly        -2.42
Epstein      -2.41
Robertson    -2.39
Positive Logits
respawn      2.97
spam         2.58
hus          2.50
td           2.50
yo           2.46
@@           2.44
pup          2.37
в            2.34
hrs          2.31
lab          2.30
Activations Density 0.000%
No Known Activations
This feature has no known activations.