INDEX
Explanations
the letters 'ib' followed by a number
instances of the substring "ib" in various contexts
Negative Logits
ISTER          -0.67
Defender       -0.65
CHO            -0.63
CPS            -0.63
Russ           -0.63
Bender         -0.63
Blizzard       -0.62
Ceres          -0.62
seriousness    -0.61
||||           -0.60
Positive Logits
ilib           1.34
ibl            1.08
ulous          1.07
ib             1.01
raltar         0.98
acter          0.97
uilt           0.97
rahim          0.93
odies          0.90
uana           0.90
Activations Density 0.010%
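
The page does not define the density figure. A common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 1 of 10,000 token positions.
sample = [0.0] * 9999 + [2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.010%"
```

Under this reading, a density of 0.010% means the feature fires on roughly one token in ten thousand, which fits a feature tied to a narrow substring pattern such as "ib".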