INDEX
Explanations
references to specific lines or linear representations in a document
Negative Logits
Token       Logit
Campbell    -0.43
superuser   -0.43
mutlich     -0.42
poroz       -0.42
Scha        -0.41
desay       -0.41
Great       -0.41
Balth       -0.41
appspot     -0.41
Erst        -0.40
Positive Logits
Token       Logit
Line        1.33
Line        1.30
line        1.23
LINE        1.23
line        1.20
LINE        1.15
Lines       1.04
Lines       1.03
lines       1.02
lines       0.98
Activations Density 0.182%