Explanations
References to historical figures, with a focus on their lineage or titles.
The particles "bin", "ibn", or "Mac" followed by a name.
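The second explanation describes a surface pattern that can be sketched as a regular expression. This is only an illustration of the wording above, not of how the feature itself is computed; the pattern, the name `PARTICLE_NAME`, and the helper `find_particle_names` are all hypothetical, and the assumption that the name is a single capitalized ASCII word is ours.

```python
import re

# Hypothetical pattern: one of the particles named in the explanation
# ("bin", "ibn", "Mac"), then whitespace, then a capitalized word.
# Assumption: the following name is a single ASCII word.
PARTICLE_NAME = re.compile(r"\b([Bb]in|[Ii]bn|Mac)\s+([A-Z][a-z]+)")


def find_particle_names(text):
    """Return (particle, name) pairs for each match in text."""
    return [(m.group(1), m.group(2)) for m in PARTICLE_NAME.finditer(text)]


if __name__ == "__main__":
    print(find_particle_names("Ahmad ibn Hanbal met Khalid bin Walid."))
```

The leading `\b` keeps the pattern from firing inside ordinary words such as "Robin", where "bin" is not a standalone particle.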
Negative Logits
ProtoMessage    -0.52
expandindo      -0.39
ReusableCell    -0.37
ſta             -0.36
referrerpolicy  -0.35
foglal          -0.35
ſtate           -0.34
Etc             -0.34
collectively    -0.34
bonté           -0.33
Positive Logits
bin   0.94
ibn   0.85
Bin   0.83
Ibn   0.79
ابن   0.75
BIN   0.74
Ibn   0.74
bint  0.72
Bin   0.71
بن    0.66
Activations Density 0.012%