Explanations
information and references from Wikipedia
Negative Logits
charism     -0.75
pter        -0.74
cffffcc     -0.73
stra        -0.71
rone        -0.69
sbm         -0.69
taboola     -0.69
Bethlehem   -0.68
eping       -0.68
ayed        -0.65
Positive Logits
ipedia        1.38
Commons       1.13
encyclopedia  1.00
pedia         0.98
Wikipedia     0.93
wiki          0.89
Leaks         0.88
Wikipedia     0.86
Template      0.85
edits         0.84
Activations Density: 0.009%