INDEX
Explanations
names of people or places
references to groups or collectives, particularly in relation to animals or populations
Negative Logits
peel          -0.77
cyclopedia    -0.75
decomp        -0.74
FK            -0.74
ructose       -0.74
capacitor     -0.72
lear          -0.67
ibble         -0.67
Agent         -0.65
enthus        -0.65
Positive Logits
į             0.83
Rasm          0.83
hammad        0.80
Mehran        0.73
angelo        0.73
Kirin         0.73
herds         0.72
Lah           0.70
Lars          0.70
Karin         0.69
Activations Density: 0.029%