INDEX
Explanations
references to the middle layer or middle-aged concepts
Negative Logits
autorytatywna    -0.49
ſy               -0.47
matel            -0.46
Majefty          -0.45
pouvoit          -0.45
étoit            -0.45
Gouver           -0.44
suspen           -0.43
ſol              -0.43
pleaſure         -0.43
Positive Logits
中               0.76
middle           0.75
中               0.73
closest          0.71
Middle           0.68
middle           0.66
Middle           0.65
MIDDLE           0.65
tengah           0.65
reason           0.63
Activations Density: 1.692%