INDEX
Explanations
mentions of different generations in a context
Negative Logits
.jav     -0.16
/libs    -0.15
ining    -0.14
Fir      -0.14
Edu      -0.14
bites    -0.14
लà¤Ĺ      -0.14
lian     -0.14
bard     -0.14
ullo     -0.14
Positive Logits
230      0.15
cono     0.15
jure     0.14
****************************************************************  0.14
é©       0.14
Freeman  0.13
Sommer   0.13
CB       0.13
inea     0.13
overe    0.13
Activations Density 0.005%