Explanations
brain-related terms, particularly variations of the word "rad" and its derivatives
Negative Logits
Token     Logit
iw        -0.17
ffset     -0.17
ably      -0.16
i         -0.16
и         -0.15
ystone    -0.15
eel       -0.15
iya       -0.15
antro     -0.15
課        -0.14
Positive Logits
Token     Logit
cliffe    0.25
isson     0.22
ishes     0.21
icals     0.20
ziej      0.19
olph      0.19
olf       0.18
Rad       0.18
elaide    0.18
ical      0.18
Activations Density: 0.010%