INDEX
Explanations
occurrences of indefinite articles and descriptors for entities
Negative Logits
sortie      -0.16
ảnh         -0.16
figure      -0.15
eck         -0.14
prise       -0.14
rick        -0.14
ajs         -0.14
recovery    -0.14
Prev        -0.14
inand       -0.14
Positive Logits
lieu        0.17
laps        0.16
ifier       0.16
ifo         0.16
regain      0.15
passage     0.15
MBER        0.15
programme   0.14
sympt       0.14
Brightness  0.14
Activations Density 0.025%