INDEX
Explanations
abstract concepts or ideas
occurrences of the word "abstract" and related concepts
Negative Logits
risome      -0.81
ICAN        -0.79
odder       -0.76
UNCH        -0.72
omore       -0.70
ods         -0.66
attering    -0.64
hiba        -0.63
artney      -0.62
unker       -0.62
Positive Logits
ions        1.01
edly        0.95
edIn        0.92
stract      0.90
furt        0.84
algebra     0.82
¥µ          0.80
aby         0.79
matter      0.78
syntax      0.77
Activations Density 0.012%