INDEX
Explanations
Instances of the character "L", or variants of it, in the contexts where it appears
Head Attr Weights
0:0.10
1:0.13
2:0.03
3:0.04
4:0.06
5:0.20
6:0.10
7:0.01
8:0.11
9:0.12
10:0.03
11:0.03
Negative Logits
gas                  -2.00
GI                   -1.79
fix                  -1.78
fixes                -1.74
Georg                -1.70
к                    -1.66
circle               -1.58
[no visible token]   -1.57
cha                  -1.55
Soviet               -1.52
Positive Logits
assetsadobe          1.69
Secondary            1.56
disav                1.52
affirmative          1.45
Privacy              1.38
Speech               1.34
GamerGate            1.33
quake                1.32
Twe                  1.31
ripple               1.30
Activations Density: 0.020%