INDEX
Explanations
phrases involving coordinating conjunctions such as "and"
conjunctions that connect thoughts or ideas
Negative Logits
millenn     -0.68
Azerb       -0.55
Fed         -0.55
Palestin    -0.54
Magikarp    -0.54
depos       -0.53
ilty        -0.52
isolation   -0.52
WORK        -0.52
icity       -0.52
Positive Logits
rogen       1.50
rogens      1.42
ro          1.07
then        1.03
romeda      1.02
rew         0.97
rology      0.92
rea         0.89
ros         0.88
rost        0.85
Activations Density: 0.173%